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Abstract 

Million-transistor  processors  are  being  manufactured  today,  and  soon  it  will 
be  possible  to  put  several  million  transistors  on  one  integrated  circuit.  While 
memory  applications  of  this  technology  are  clear,  it  is  not  obvious  how  best  to 
use  it  for  computation  purposes.  One  possibility  is  the  architecture  of  the 
Message-Driven  Processor  (MDP),  which  consists  of  a  324-4-bit  CPU,  memory, 
and  a  network  interface  together  on  one  chip.  MDPs  can  be  connected  (h- 
rectly  to  each  other  to  form  a  65536-processor,  message-passing,  MEMD,  par¬ 
allel  computer,  the  J-Machine.  The  MDP’s  architecture  is  unusual  in  that  it 
provides  a  very  high  processing  power  to  memory  ratio. 

Concurrent  Smalltalk  is  the  primary  language  used  for  programming  the  J- 
Machine.  Concurrent  Smallt^k  is  the  the  language  of  choice  because  it  fits 
the  J-Machine’s  fine-grain,  message-passing  model  well.  This  thesis  de¬ 
scribes  Concurrent  Smalltalk  and  its  implementation  on  the  J-Machine,  in¬ 
cluding  the  Optimist  II  compiler  and  Cosmos  operating  system.  Optimist  II 
can  perform  global  optimization  of  programs,  including  inline  function  expan¬ 
sion,  type  inference,  and  global  evaluation  of  constant  expressions.  Next, 
Cosmos  and  the  Concurrent  Smalltalk  nmtime  environment  are  described. 
Finally,  some  quantitative  and  qualitative  results  are  presented.  The  grain 
size  (the  average  amoimt  of  time  a  method  executes  before  suspending)  was 
found  to  be  about  60  instructions,  and  the  MDP  was  found  to  execute  one  in¬ 
struction  every  two  or  four  cycles,  depending  on  whether  external  DRAM  is 
used.  A  number  of  qualitative  issues  are  described,  along  with  a  few  prelimi¬ 
nary  results  for  addressing  difficult  problems  such  as  controlling  parallelism. 
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Chapter  1.  Introduction 


Goals 

This  thesis  describes  the  Concurrent  Smalltalk  language  and  its  implementation  on  the  Mes¬ 
sage-Driven  Processor.  Concurrent  Smalltalk,  also  known  as  CST,  is  a  concurrent  version  of 
the  object-oriented  programming  language  Smalltalk  [20].  The  implementation  consists  of  a 
global,  optimizing  compiler  and  a  streamlined  operating  system  for  the  J-Machine. 

This  thesis  covers  quite  a  broad  scope  of  the  implementation  of  Concurrent  Smalltalk,  includ¬ 
ing  subjects  ranging  from  issues  in  parallel  programming  in  general  and  the  design  of  Con¬ 
current  Smalltalk  itself  to  some  of  the  fine  points  of  the  design  and  optimization  of  the  MDP 
architecture.  The  goal  of  the  thesis  is  to  demonstrate  a  working  implementation  of  Concur¬ 
rent  Smalltalk  on  the  Message-Driven  processor.  Although  the  implementation  is  not  yet 
complete,  it  does  provide  hooks  for  all  of  the  advertised  functionality  of  Concurrent  Smalltalk 
and  is  based  on  solid  ground.  Versions  of  the  implementation  are  running  on  recently  manu¬ 
factured  MDP  chips,  and  I  hope  that  the  programs  described  herein  will  survive  and  evolve 
for  the  next  five  years. 

Another  goal  of  this  thesis  was  to  discover  and,  whenever  possible,  fix  design  flaws  in  the 
MDP  architecture  and  language  specification  so  as  to  make  an  implementation  of  Concurrent 
Smalltalk  practical.  Several  errors  in  the  MDP  architecture  and  Concurrent  Smalltalk  were 
found,  as  well  as  numerous  bugs  in  the  simulation  tools  used  to  verify  the  hardware. 

The  next  section  gives  a  brief  overview  of  the  J-Machine  hardware  and  the  Concurrent 
Smalltalk  language.  It  is  followed  by  an  outline  of  the  software  bridging  the  gap  between 
Concurrent  Smalltalk  and  the  MDP  hardware — the  Optimist  II  compiler  and  the  Cosmos  op¬ 
erating  system.  The  relationship  of  this  work  to  others’  in  fine  grain  concurrent  computation 
is  then  described. 

Second  Edition 

This  work  was  originally  a  Master’s  thesis  completed  in  May  1989.  It  has  been  updated  for 
the  state  of  Optimist  II  compiler.  Cosmos  operating  system,  and  MDPSim  7.0  simulator  as  of 
the  end  of  May  1991.  The  Optimist  II  compiler  now  produces  better  code,  and  several  Cos¬ 
mos  routines,  especially  the  CFUT  fault  handler,  have  been  sped  up.  Furthermore,  Cosmos 
has  been  updated  for  a  few  minor  architectural  revisions. 

The  compiler  and  operating  system  have  been  evolving  rapidly  in  the  past  few  months  due  to 
the  recent  availability  of  MDP  chips.  "This  document  does  not  include  these  newest  changes, 
which  include  support  for  hardware  I/O,  debugging  aids,  and  workarounds  for  first-silicon 
chip  bugs,  as  they  have  little  effect  on  the  ideas  in  this  work.  Other  members  of  the  Concur¬ 
rent  ViSl  Architecture  group,  including  Scott  Furman,  Rich  Lethin,  Todd  Dampier,  Shaun 
Kaneshiro,  John  Keen,  and  Mike  Noakes,  are  now  working  on  CST  applications  and  Cosmos 
enhancements  such  as  floating-point  arithmetic,  queue  overflow  handling,  and  garbage 
collection.  These  will  be  published  in  separate  documents  as  they  are  completed. 
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1.1.  Hardware  and  Software  Architecture 


The  J-Machine 

Million-transistor  processors  being  manufactured  today,  and  soon  it  will  be  possible  to  put 
several  million  transistors  on  one  integrated  circuit.  While  memory  applications  of  this 
technology  are  clear,  it  is  not  obvious  how  best  to  use  it  for  computation  purposes.  One  pos¬ 
sibility  is  the  architecture  of  the  Message-Driven  Processor  (MDP),  which  consists  of  a  32+4- 
bit’  CPU,  memory,  and  a  network  interface  together  on  one  chip.  MDPs  can  be  connected 
directly  to  each  other  to  form  a  65536-processor,  message-passing,  MIMD,  parallel  computer, 
the  J-Machine  [14].  The  network  is  a  three-dimensional  mesh  fast  enough  to  provide  com¬ 
munication  between  the  farthest  pair  of  processors  on  a  65536-processor  J-Machine  in  a  few 
microseconds — on  an  unloaded  network  an  8-word  message  can  be  transmitted  from  one 
comer  of  the  J-Machine  to  the  other  in  just  4  microseconds.  The  processors  are  optimized  for 
sending  and  receiving  messages;  a  processor  can  be  working  on  a  message  even  before  the 
entire  message  has  arrived.  The  MDPs  architecture  is  unusual  in  that  it  provides  a  very 
high  processing  power  to  memory  ratio. 

The  Message-Driven  Processor 

The  MDP  has  a  register-based  architecture  and  operates  on  32-bit  data  words  with  4-bit  tags. 
Tags  are  essential  in  efficiently  supporting  late  binding  for  object-oriented  languages  such  as 
Concurrent  Smalltalk.  In  addition,  tags  are  necessary  for  garbage  collection  and  valuable  for 
debugging  programs. 

The  MDP  is  message-based.  In  its  normal  mode  of  operation,  the  MDP  listens  on  the  net¬ 
work  for  messages.  When  it  receives  a  message  from  the  network,  it  stores  the  message  in  a 
FIFO  input  message  queue  and  dispatches  on  the  address  given  in  the  first  word  of  the  mes¬ 
sage.  Messages  are  used  for  all  communication  tasks,  including  function  and  method  calls, 
replies,  object  transfers,  and  other  synchronization  facilities. 

A  detailed  but  slightly  obsolete  description  of  the  MDP  architecture  is  in  [16];  a  updated 
summary  is  presented  in  Appendix  D.  MDPSim  [24]  [25]  is  an  instruction  level  simuls'  rr, 
assembler,  and  debugger  used  to  run  MDP  assembly  language  programs  and  test  the  operat¬ 
ing  system. 

Concurrent  Smalltalk 

Concurrent  Smalltalk  is  the  primary  language  used  to  program  the  J-Machine.  One  of  the 
main  goals  of  designing  Concurrent  Smalltalk  was  to  take  advantage  of  the  J-Machine’s 
unique  features.  A  new  software  architecture  was  needed  that  would  efficiently  support  fine- 
grain,  message-passing  computation.  Whereas  some  existing  parallel  computers  have  mes¬ 
sage  routing  times  measured  in  milliseconds,  the  routing  time  for  a  message  sent  from  one 
end  of  even  a  large  J-Machine  to  another  is  on  the  order  of  several  microseconds.  Operating 
system  overhead  on  processing  and  dispatching  that  message  of  more  than  a  few  microsec¬ 
onds  is  not  acceptable. 

Concurrent  Smalltalk  introduces  concurrency  to  standard  Smalltalk  by  evaluating  argu¬ 
ments  to  method  calls  in  parallel  as  well  as  allowing  the  computation  of  the  value  of  a  vari¬ 
able  to  proceed  in  parallel  with  the  other  computations  of  a  method  until  the  variable’s  value 
is  actually  needed.  Furthermore,  Concurrent  Smalltalk  adds  distributed  objects  to  Smalltalk. 
A  distributed  object  is  an  object  that  can  process  many  methods  at  the  same  time  without 
any  serialization  bottlenecks  other  than  those  required  by  the  algorithm  in  use.  Although 


’Each  word  consista  of  32  bits  of  data  and  a  4-bit  tag. 
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standard  objects  can  also  process  several  methods  simultaneously,  they  can  only  dispatch  on 
one  method  at  a  time^ 

Concurrent  Smalltalk  is  an  ideal  language  for  programming  the  J-Machine  because  it  is  easy 
to  parallelize  and  yields  small,  fine-grain  methods  as  well  as  a  considerable  amount  of  flexi¬ 
bility  in  the  system  software  implementation.  The  methods  dealing  with  a  particular  class 
can  travel  to  the  data  object  as  opposed  to  the  data  traveling  to  the  code.  Concurrent 
Smalltalk  also  provides  excellent  facilities  for  creating  data  abstractions;  the  Optimist  II 
compiler  amplifies  this  power  by  providing  global  optimizations  so  performance  does  not  suf¬ 
fer  l^cause  abstractions  are  used. 

Another  advantage  of  Concurrent  Smalltalk  is  that  it  is  low-level  enough  to  be  useful  in  im¬ 
plementing  parts  of  the  J-Machine  runtime  system,  while  being  at  a  level  high  enough  that 
the  programmer  does  not  have  to  worry  about  the  infamous  problems  of  parallel  process  syn¬ 
chronization  and  deadlocks.  In  fact,  once  the  data  structures  are  defined  properly,  pro¬ 
gramming  in  Concurrent  Smalltalk  feels  much  like  programming  in  a  standard  sequential 
language. 


'This  restriction  is  relaxed  for  immutable  standard  objects  because  they  may  be  copied  at  the  operating  system's 
discretion.  Nevertheless,  a  distributed  object  con  be  mutable  ond  still  have  no  synchronization  bottlenecks. 
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1.2.  Overview 


Foundations 

Some  of  the  pieces  comprising  the  Concurrent  Smalltalk  environment  were  available  before 
this  thesis  was  done.  A  primitive  compiler  was  available  [21],  as  were  a  description  of  the 
operating  system  kernel  [38],  several  descriptions  of  the  language  [13]  [21]  [17],  and  an  MDP 
assembly  language  simulator  (MDPSim  5.2)  [24].  Unfortunately,  none  of  the  pieces  really  fit 
together — the  various  versions  of  the  language  were  inconsistent,  the  output  of  the  compiler 
was  incompatible  with  the  untested  operating  system  kernel,  which  itself  was  written  for  an 
obsolete  version  of  the  MDP  architecture  [231. 

It  became  clear  that  it  would  be  easier  to  design  the  language,  the  compiler,  and  the  operat¬ 
ing  system  from  scratch  than  to  try  to  fit  the  existing  pieces  together.  Nevertheless,  the  ex¬ 
isting  code  and  ideas  were  useful  as  guides  to  which  approaches  would  likely  yield  good  re¬ 
sults  and  which  techniques  should  be  abandoned.  I  took  advantage  of  this  opportunity  to 
extend  Concurrent  Smalltalk  to  support  several  programming  styles  and  add  functions,  clo¬ 
sures,  continuations,  arrays,  nested  local  variables,  and  inline  classes  to  produce  a  language 
with  a  compact  implementation  yet  powerful  libraries.  The  new  features  did  not  complicate 
implementation;  in  fact,  by  providing  a  small  set  of  fundamental  primitives,  the  new  features 
often  simplified  the  implementation  of  existing  functionality,  a  phenomenon  noticed  in  the 
design  of  the  Scheme  language  [31]  [1]. 

The  contributions  of  this  thesis  include: 

•  A  redesign  of  the  Concurrent  Smalltalk  language. 

•  Optimist  II,  a  new  Concurrent  Smalltalk  compiler  and  interpreter. 

•  Cosmos  (Concurrent  Smalltalk  Operating  System),  an  operating  system  that  supports 
Concurrent  Smalltalk  on  the  MDP. 

•  Runtime  libraries  for  Concurrent  Smalltalk. 

•  Modifications  to  MDPSim,  the  MDP  assembler/simulator,  to  facilitate  downloading  pro¬ 
grams,  simplify  debugging,  and  collect  performance  measurements. 

•  Modifications  to  the  MDP  architecture  that  make  it  more  suitable  for  Concurrent 
Smalltalk. 

I  am  indebted  to  Scott  Wills  and  Andrew  Chien  for  helping  with  the  redesign  of  the  Concur¬ 
rent  Smalltalk  language,  and  Richard  Lethin,  John  Keen,  and  Stuart  Fiske  for  helping  with 
the  MDP  architecture  changes.  Professor  William  Dally  supervised  the  project. 

System  Overview 

The  Optimist  II  Compiler 

The  Optimist  II  compiler  continues  in  the  tradition  of  the  Optimist  compiler  by  compiling 
Concurrent  Smalltalk  to  assembly  code  that  is  as  small  as  possible  without  sacrificing  speed. 
In  addition.  Optimist  II  contains  an  interactive  Concurrent  Smalltalk  interpreter  that  is 
useful  for  prototyping  and  debugging  Concurrent  Smalltalk  programs  at  the  source  level. 
Optimist  II  is  also  a  platform  for  experimenting  with  compiler  optimizations.  Global  opti¬ 
mizations  such  as  function  inlining  and  the  reduction  of  method  calls  to  function  calls  were 
added  and  found  to  be  highly  successful. 
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The  compiler  itself  is  divided  into  several  phases,  which  are  described  in  more  detail  in 
Chapter  3.  It  produces  an  MDPSim  command  file  which  can  be  downloaded  into  MDPSim 
and  nin  on  a  simulated  J-Machine. 

Cosmos 

Cosmos  is  the  operating  system  used  on  the  Message-Driven  Processor  to  support  code  out¬ 
put  by  Optimist  II.  Many  of  the  ideas  in  Cosmos  are  borrowed  from  JOSS  [38]  written  by 
Brian  Totty — JOSS  introduced  the  concept  of  a  Birth/Residence  Address  Table  (BRAT)  and 
the  protocol  for  migrating  object  between  processors.  Nevertheless,  Cosmos’s  code  bears  lit¬ 
tle  resemblance  to  JOSS. 


Figure  1-1.  Software  Environment  Organization 

A  Concurrent  Smalltalk  program  can  be  either  compiled  or  interpreted  by  the  Optimist  II  compiler.  Interpretation  is 
useful  to  debug  Concurrent  Smalltalk  programs  and  interactively  experiment  with  language  features.  When  a 
Concurrent  Smalltalk  program  is  compiled,  it  is  loaded  into  MDPSim,  a  J-Machine  simulator,  together  with  the 
Cosmos  operating  system.  MDPSim  will  then  run  the  program  to  obtain  its  results  as  well  as  program  perfor¬ 
mance  statistics. 

The  main  goals  of  Cosmos  were  to  make  a  working  operating  system,  make  it  as  efficient  as 
possible,  and  make  it  simple,  all  subject  to  the  time  constraints  of  a  Master’s  thesis.  Those 
three  goals  have  been  achieved  to  a  large  extent,  in  that  the  operating  system  does  work,  and 
simple  programs  have  been  run  on  it.  Unfortunately,  controlling  a  large  parallel  computer  is 
a  difficult  task,  and  Cosmos  still  falls  short  in  many  ways  which  are  described  in  Chapter  8. 
In  particular,  higher-level  resource  management  and  load  balancing  issues  are  yet  to  be  ade- 
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quately  addressed.  Nevertheless,  Cosmos  is  a  good  start  and  a  platform  for  experimenting 
with  the  more  difficult  problems. 

Example 

A  very  simple  example  of  the  use  of  the  system  to  compile  and  run  a  factorial  program  is 
listed  below.  Please  refer  to  chapter  5  for  a  more  detailed  example  of  the  transformations  in 
the  compiler  and  Appendices  B  and  C  for  information  about  using  the  compiler  and  the  oper¬ 
ating  system. 

CST : (d*£un  fact  (n) 

(if  (On  1) 

1 

(*  n  (factorial  (-  n  1))))) 

#<Cst-Lambda  5090060  FACT> 

CST:  (fact  3) 

VOien  interpreting:  (FACT  3) 

Error:  Unbound  global  FACTORIAL 

>  Break: 

>  Type  Command-/  to  continue.  Command-,  to  abort. 

1  >  Continuing. . .Fatal  error:  Can't  apply  #<Nil> 

>  Break: 

>  Type  Command-/  to  continue,  Command-,  to  abort. 

1  >  Continuing... 

C5r;(da£ur.  f&ct  (n) 

(if  (On  1) 

1 

(*  n  (fact  (-  n  1))))) 

♦<Cst-Lambda  4920924  FACT> 

CST: (fact  4) 

#<Integer  24> 

CST :  (oonplla  fact  "NawFact .  ladp" ) 

Figure  1-2.  Compiling  Fact 

The  user  entered  a  factorial  function,  corrected  an  error  in  it,  tested  it  on  a  sample  input,  and  then  compiled  it  into 
MOP  assembly  code  in  the  NewFact.mdp  file.  The  user's  input  is  shown  in  bold. 

First  the  user  starts  the  compiler  and  enters  the  compiler’s  interactive  mode  (see  Appendix 
B)  as  shown  in  Figure  1-2.  He  enters  the  fact  function  and  runs  it  only  to  find  an  error — fact’s 
recursive  call  should  be  to  fact,  not  factorial.  The  user  corrects  the  error  and  then  uses  the 
compiler’s  interpreter  to  successfully  compute  the  factorial  of  4. 

Afterwards  the  user  compiles  fact  to  MDP  assembly  code,  quits  Optimist  II,  and  launches 
MDPSim,  where  he  loads  the  object  file,  and  calls  fact  on  4  to  get  the  correct  answer — 24 
(Figure  1-3).  The  stats  command  can  then  be  used  to  determine  some  running  statistics, 
such  as  the  frequencies  of  instructions  executed,  the  amount  of  parallelism  used,  and  the  to¬ 
tal  time  taken  to  run  the  program.  Starting  from  a  cold  start,  fact  takes  725  steps  on  a  2x2x1 
J-Machine  to  compute  its  answer. 

Implementation 

The  Optimist  II  compiler  is  written  in  CLOS  [271,  the  Common  Lisp  Object  System.  Except 
for  the  use  of  the  LOOP  iteration  macro  [7],  Optimist  II  adheres  to  standard  Common  Lisp  as 
specified  in  [35]  and  amended  in  [6]  and  in  the  amendments  specified  by  the  Common  Lisp 
Cleanup  Committee  that  were  available  at  the  time  of  this  writing.  'The  LOOP  macro  is  itself 
written  in  standard  Common  Lisp,  so  Optimist  should  run  on  any  machine  with  a  faithful 
implementation  of  Common  Lisp.  A  slightly  modified  version  of  the  12/7/88  version  of  Xe¬ 
rox’s  PCL  was  used  to  implement  a  subset  of  CLOS  before  Apple  Common  Lisp  2.0  became 
available. 
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MDPSin  -X  2  -y  2  -msiza  0x1000  :: Cosmos :Cosinoa.m  NawFsct.md^ 

Message-Driven  Processor  Simulator 
Version  7.0  Rev  B 

Accompanies  MDP  Architecture  Document  IIB 
Written  by  Waldemar  Horwat 

Architecture  Updates  by  Brian  Totty  and  Jerry  Larivee 
UROPs  for  Bill  Dally 

4  MDPs  present. 

@0. .3)MESSAGE  fact4 

Message ) MSG: msgi^plyFunction I  5 

Message } {fFact} 

Message) 4 
Message IIONODE 
Message )0 
Message )END 
@0..3)injact  £act463 
@  0 . . 3 ) rasatstabs 
@0. .3)run 

Tick  724  Received  priority  0  message: 

OBJ:$801BF704  u=l  f=0  of f set»=S006EC=Reply  length-$0004 

INT:$0000FC00  -  64512 

INT:$00000000  -  0 

INT:$00000018  -  24 

@0.  .3)stats 

725  ticks  executed. 

...  More  statistics  ... 

Figure  1-3.  Running  Fact 

The  user  loaded  the  tact  object  code  and  typed  a  few  magic  incantations  that  invoked  the  fact  function  on  the  input 
4  (the  third  word  in  the  injected  message).  The  result  24  (the  fourth  word  in  the  ejected  message)  was  returned 
after  725  steps  on  a  4-node  J-Machine.  Most  of  the  time  was  spent  distributing  the  tact  code  throughout  the  J-Ma- 
chine:  the  second  time  it  only  takes  498  steps  to  compute  the  answer  (some  code  is  still  being  distributed),  the 
third  time  takes  289  steps,  and  afterwards  the  execution  time  is  about  265  steps. 


Optimist  II  was  developed  on  a  Macintosh  using  Apple  Common  Lisp  1.2.2  and  2.0  written  by 
Coral  Software  Corp  (now  merged  with  Apple  Computer,  Inc.).  It  runs  on  a  5-megabyte  Mac¬ 
intosh  II,  although  8  megabytes  are  recommended  and  at  least  16  are  needed  to  run  Optimist 
II  and  MDPSim  simultaneously. 

Cosmos  is  written  in  MDP  assembly  language  [16].  MDPSim  [24]  [25]  was  used  as  an 
assembler  and  simulator  for  a  small  J-Machine. 

All  of  the  software  needed  to  compile  and  run  Concurrent  Smalltalk  programs  exists  on  both 
a  Macintosh  II  platform  and  on  Sun  workstations. 

Results 

The  primary  result  of  this  work  is  a  demonstration  of  a  working  implementation  of  Concur¬ 
rent  Smalltalk  on  a  J-Machine.  In  addition,  a  number  of  secondary  results  were  obtained. 
These  include  the  qualitative  and  quantitative  benefits  of  optimizations  in  the  Optimist  II 
compiler,  data  on  the  expected  grain  size  (the  number  of  instructions  executed  in  response  to 
a  message),  and  a  number  of  qualitative  observations  about  the  shortcomings  of  the  current 
system.  The  results  did  not  always  come  out  as  expected.  For  example,  the  finding  that  the 
grain  size  is  about  60  instructions  was  surprising;  it  was  expected  to  be  much  lower.  Code 
statistics  indicate  that  the  MDP  will  take  about  1.9  cycles  per  instruction,  although  most  in¬ 
structions  execute  in  1  cycle;  if  slow  external  DRAM  is  used  to  hold  user  programs  and  data, 
the  MDP  could  take  as  many  as  3.5  cycles  per  instruction.  Network  loading  calculations  indi¬ 
cate  that  network  congestion  will  become  a  concern  when  the  size  of  the  J-Machine  exceeds 
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343  nodes;  either  a  faster  network  or  some  means  of  exploiting  locality  will  be  needed  for 
larger  J-Machines. 

The  quantitative  results  are  listed  in  Chapter  7,  while  the  qualitative  ones  are  in  Chapter  8. 
Chapter  8  may  seem  a  little  pessimistic,  but  many  of  the  current  shortcomings  listed  there 
would  not  have  been  found  had  this  work  not  been  done;  furthermore,  the  current  implemen¬ 
tation  of  Cosmos  provides  a  great,  highly  accurate  platform  for  research  into  the  issues  pre¬ 
sented  there. 

Caveats 

Due  to  the  availability  of  only  a  finite  amount  of  time  for  writing  this  thesis,  which  could  po¬ 
tentially  involve  an  infinite  amount  of  work,  some  features  could  not  be  included  in  the  cur¬ 
rent  implementation  of  Concurrent  Smalltalk.  The  biggest  omission  is  the  lack  of  garbage 
collection — if  enough  storage  isn’t  reclaimed,  the  machine  will  fail.  Garbage  collection, 
although  interesting,  was  omitted  to  keep  this  project  to  a  reasonable  size — a  good  garbage 
collector  and  load  manager  would  require  more  effort  than  is  desirable  for  a  Master’s  thesis. 

Full  futures  were  also  not  implemented.  They  were  omitted  from  the  interpreter  in  the  com¬ 
piler  because  simulating  them  is  difficult  on  a  sequential  machine  in  a  sequential  language 
(Common  Lisp).  Futures  were  omitted  from  the  run-time  system  because  of  the  considerable 
amount  of  work  needed  to  implement  all  the  fault  handlers  and  special  cases  involved.  Ne¬ 
vertheless,  almost  all  Concurrent  Smalltalk  programs  still  attain  reasonable  parallelism 
through  the  use  of  efutures*,  which  are  fully  operational. 

Other  features  that  were  not  implemented  are  I/O  facilities  at  both  the  Optimist  II  and  Cos¬ 
mos  levels  and  runtime  support  for  local  (non-distributed)  arrays  and  floating  point  numbers. 
I/O  facilities,  while  useful,  do  not  contribute  much  to  the  project  and  are  easy  to  add  later. 
Local  arrays  and  floating  point  numbers  are  supported  by  the  Optimist  II  compiler  but  not 
the  runtime  system;  supporting  them  at  the  runtime  level  will  require  writing  MDP  assem¬ 
bly  language;  no  major  surprises  are  expected  there. 

Some  of  the  optional  features  of  Concurrent  Smalltalk  were  not  included  due  to  a  lack  of 
time.  All  class  inline  declarations  are  currently  ignored;  I  anticipate  that  it  will  be  possible 
to  inline  objects  inside  other  objects  sometime  in  the  future,  but  that  is  not  a  high  priority  at 
this  time.  The  omission  of  class  inlining  does  not  change  the  semantics  of  Concurrent 
Smalltalk  programs.  Function  inlining  is  more  useful,  and  it  does  work  now. 

Reading  Guide 

The  remainder  of  this  chapter  describes  related  work  in  fine-grain  concurrent  computation. 
The  succeeding  chapters  delve  into  various  aspects  of  the  system,  starting  from  the  top — 
Chapter  2,  Concurrent  Smalltalk,  provides  an  introduction  to  the  Concurrent  Smalltalk 
language  in  general.  Chapter  3,  The  Optimist  II  Compiler,  describes  the  Concurrent 
Smalltalk  compiler  and  interpreter.  Chapter  4,  The  Cosmos  Operating  System,  describes 
the  operating  system.  To  avoid  overlap,  the  compiler  features  documented  in  [21]  are  not 
documented  here;  thus,  it  might  be  helpful  to  consult  [21]  when  reading  Chapter  3. 

Chapter  5,  Sample  Program,  traces  the  progress  of  a  sample  program  from  the  Concurrent 
Smalltalk  source  level  down  to  object  code.  Chapter  6,  Debugging,  provides  some  debugging 
techniques  for  Concurrent  Smalltalk  and  MDP  programs.  Chapters  7,  Performance  Mea¬ 
surements,  and  8,  Future  Evolution,  present  the  results  of  this  work.  Chapter  7  contains 
quantitative  measurements  of  the  performance  of  Cosmos  and  the  compiled  code,  while 
(Chapter  8  describes  some  of  the  less  tangible,  qualitative  shortcomings  of  the  current  system 
and  ideas  for  correcting  them.  Chapter  9,  Conclusion,  concludes  the  main  body  of  the  the¬ 
sis. 


cfiiture,  also  called  a  context  future,  is  a  local  future  which  cannot  be  passed  outside  the  function  without  being 
touched  (i.e.  replaced  by  its  value). 
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The  appendices  parallel  the  main  chapters  with  more  detailed  information.  Appendix  A, 
Concurrent  Smalltalk  Reference,  is  the  most  important,  for  it  contains  the  specification 
of  Concurrent  Smalltalk.  Appendix  B,  Using  Optimist  n,  provides  a  detailed  description  of 
the  Optimist  II  features  not  listed  in  Appendix  A.  Similarly,  Appendix  C,  Using  Cosmos,  is 
a  guide  to  running  Cosmos  on  MDPSim;  the  latest  MDPSim  reference  manual  [25]  should 
also  be  consulted  when  running  Cosmos.  Appendix  D,  MDP  Architecture  Summary, 
summarizes  the  current  version  of  the  MDP  architecture.  Finally,  Appendix  F,  Cosmos 
Listing,  contains  a  listing  of  the  entire  operating  system. 

Since  this  thesis  also  serves  as  a  reference  manual  for  Concurrent  Smalltalk,  Chapter  2  and 
Appendices  A  and  B  have  been  indexed.  The  index  appears  at  the  end  of  the  thesis. 
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1.3.  Related  Work 

The  ideas  of  optimizing  Smalltalk  and  running  object-oriented  software  on  concurrent,  fine- 
grain  systems  are  not  new,  but  they  have  not  been  integrated  previously  to  the  extent  found 
on  the  J-Machine.  While  most  of  the  efforts  concentrated  on  either  optimizing  Smalltalk  for 
conventional  computers  or  developing  radically  new  programming  methodologies.  Concurrent 
Smalltalk  presents  a  somewhat  conventional  Smalltalk  environment  to  the  programmer 
(with  a  few  new  features  such  as  futures  and  distributed  objects),  which  is  at  the  same  time 
efficiently  implemented  on  a  fine-grain  parallel  computer. 

A  major  contribution  of  this  work  is  the  actual  optimized  implementation  of  Concurrent 
Smalltalk  on  an  assembly  language  architecture.  While  theoretical  studies  and  simulations 
in  higher-level  languages  can  yield  asymptotic  and  qualitative  results,  an  implementation 
yields  the  constant  factors  determining  a  system’s  performance.  These  performance  mea¬ 
surements  are  an  important  part  of  this  work,  as  they  indicate  the  relative  costs  of  the  primi¬ 
tive  operations  and  can  be  used  to  gauge  the  true  performance  of  a  concurrent  computer. 

Smalltalk  Systems 

Smalltalk-80 

Early  Smalltalk-80  optimization  efforts  such  as  [18]  concentrated  on  optimizing  Smalltalk 
within  the  constraints  of  the  byte  code  interpreter.  In  addition,  the  work  was  limited  by  the 
Smalltalk-80  constraints  of  making  contexts  and  methods  program-visible  data  structures, 
which  required  some  effort  to  convert  between  the  optimized  and  standardized  versions  of 
the  structures.  Several  context  optimizations  are  also  presented  in  [18],  including  determin¬ 
ing  which  contexts  which  can  be  referred  to  as  first-class  data  objects  and  which  contexts  can 
be  pointed  by  blocks.  Most  contexts  do  not  fall  into  either  category,  and  they  can  be  placed 
on  the  stack.  Such  optimizations  are  now  also  commonly  done  in  Lisp  compilers  [36]. 

Whereas  early  Smalltalk-80  implementations  were  constrained  to  compatibility  with  byte 
codes  and  were  run  on  stack  machines,  Concurrent  Smalltalk  is  bound  by  neither  constraint. 
The  formats  of  contexts  and  method  code  are  not  defined  in  the  language,  and  there  are  no 
portable  means  to  store  a  pointer  to  a  context  in  a  programmer-visible  variable.  Thus,  Opti¬ 
mist  II  and  Cosmos  can  use  the  most  efficient  format  for  a  context  or  even  several  different 
formats  if  they  so  desire.  Furthermore,  the  MDP  is  not  a  stack-based  machine,  so  there  are 
no  clear  advantages  to  determining  which  contexts  will  be  live  for  a  long  time.  Also,  contexts 
are  fully  self-contained,  so  a  closure  cannot  refer  to  a  context.  Finally,  several  techniques  are 
used  to  optimize  closures.  As  will  be  seen  in  Chapter  3,  when  a  closure  is  created,  either  the 
lexical  variables  are  copied  into  the  closure,  or  a  common  object  is  made  to  which  both  the 
context  and  the  closure  refer. 

Optimized  Sequential  Smalltalk 

A  few  years  later  it  became  clear  that  global  analysis  and  optimization  were  necessary  to  op¬ 
timize  Smalltalk  programs  further.  Optimizing  Smalltalk  well  required  an  ability  to  convert 
method  dispatches  into  more  efficient  function  calls,  which  led  rise  to  several  type  systems 
for  Smalltalk  [5]  [26].  When  a  type  system  could  be  applied  to  a  Smalltalk  program,  the 
compiler  could  optimize  it  by  a  factor  of  5  to  10  over  interpreted  Smalltalk.  The  main  com¬ 
piler  optimizations  of  TS  (261  are  similar  to  those  of  Optimist  II:  Both  TS  and  Optimist  II 
can  convert  a  message  send  into  a  case  statement  of  procedure  calls,  substitute  functions  in¬ 
line,  and  optimize  tail  recursion.  In  addition,  TS  can  beta-reduce  blocks,  which  Optimist  II 
currently  cannot  do.  On  the  other  hand,  Optimist  II  contains  a  number  of  other  powerful 
dataflow  optimizations  (see  Chapter  3  and  [21])  commonly  found  in  C  compilers,  which  make 
its  assembly  language  output  close  to  optimal.  Moreover,  Optimist  II  can  evaluate  large  con¬ 
stant  expressions  at  compile  time,  and  it  can  infer  types  of  variables,  allowing  it  to  produce 


10 


Chapter  1 


Introduction 


good  code  even  though  type  declarations  in  Concurrent  Smalltalk  are  completely  optional. 
TS,  on  the  other  hand,  has  difficulties  combining  typed  code  with  untyped  code. 

The  MDP  hardware  also  plays  an  important  role  in  making  Optimist  II  efficient.  By  provid¬ 
ing  tags  and  checking  them  on  primitive  operations,  the  MDP  architecture  frees  Optimist  II 
from  the  difficult  and  often  unrewarding  process  of  analyzing  programs  trying  to  determine 
information  such  as  whether  an  integer  variable  could  contain  a  large-integer  (an  integer 
which  does  not  fit  into  a  single  32-bit  word)  or  whether  the  arguments  to  +  are  known  to  be 
numbers.  Although  this  information  is  generally  difficult  to  determine,  in  most  cases 
integers  are  small  and  the  arguments  to  arithmetic  primitives  are  usually  numbers,  so 
hardware  tag-checking  is  the  right  approach  to  this  problem.  Thanks  to  the  MDP  hardware, 
even  if  Optimist  II  cannot  determine  the  type  of  some  expression,  performance  does  not 
suffer  too  much. 

CONCURR  ENTSM  ALLTALK 

A  recent  language  close  to  Concurrent  Smalltalk  and  having  an  almost  identical  name  is 
CONCURRENTS MALLTALK  [39]  [40]  independently  developed  by  Yasuhiko  Yokote  and  Mario 
Tokoro.  CONCURRENTSMALLTALK  shares  with  Concurrent  Smalltalk  the  cfuture  facility 
(called  a  CBox  in  CONCURRENTSMALLTALK)  and  the  ability  to  process  messages  asyn¬ 
chronously.  In  addition,  CONCURRENTSMALLTALK  defines  atomic  objects,  which  Concur¬ 
rent  Smalltalk  does  not  have  but  can  easily  emulate  using  locks.  On  the  other  hand.  Concur¬ 
rent  Smalltalk  includes  distributed  objects,  which  CONCURRENTSMALLTALK  does  not  pro¬ 
vide.  Furthermore,  the  implementation  of  Concurrent  Smalltalk  is  more  optimized. 
Whereas  CONCURRENTSMALLTALK  is  implemented  as  a  byte  code  interpreter.  Concurrent 
Smalltalk  compiles  to  assembly  language. 

The  two  languages  have  somewhat  different  flavors.  CONCURRENTSMALLTALK  is  very  close 
to  Smalltalk-80,  and  most  of  the  concurrent  features  are  add-ons  that  have  to  be  explicitly 
requested  by  the  programmer.  Concurrent  Smalltalk  makes  concurrency  the  default,  and 
the  programmer  has  to  explicitly  request  sequential  processing  if  he  wants  it.  At  the  same 
time,  the  MDP  hardware  assists  Concurrent  Smalltalk  by  making  the  use  of  concurrency 
very  cheap.  For  example,  a  hardware  tag  is  provided  that  implements  cfutures  in  Concur¬ 
rent  Smalltalk  using  much  less  overhead  than  cboxes  in  CONCURRENTSMALLTALK. 

In  [40]  several  changes  to  the  original  CONCURRENTSMALLTALK  are  discussed.  Blocks  are 
treated  differently  depending  on  whether  they  were  created  by  atomic  objects’  contexts  or  not. 
Concurrent  Smalltalk’s  model  of  only  having  one  kind  of  object  and  using  locks  where  neces¬ 
sary  to  make  atomic  transactions  does  not  lead  to  these  difficulties.  Finally,  secretary  objects 
were  introduced  to  CONCURRENTSMALLTALK  to  keep  track  of  which  threads  are  waiting  for 
a  resource.  An  equivalent  facility  is  used  internally  in  locks  in  Concurrent  Smalltalk. 

Actor  Systems 

Another  recent  development  in  object  oriented  programming  was  the  rise  in  actor  systems 
[2].  An  actor  system  is  a  programming  paradigm  in  which  simple  self-contained  entities 
called  actors  communicate  with  each  other  to  run  a  program.  Much  of  the  program’s  content 
is  held  in  the  interconnections  among  the  actors.  From  the  implementation  standpoint.  Con¬ 
current  Smalltalk  shares  many  of  the  ideas  with  actor  systems,  but  the  language  itself  is  not 
designed  exclusively  as  an  actor  language.  Instead,  Concurrent  Smalltalk  is  as  a  language 
closer  to  Smalltalk  and  Lisp,  but  it  is  possible  to  write  actor-like  programs  in  Concurrent 
Smalltalk  without  too  much  trouble. 

Cantor 

Cantor  (4|  is  both  a  programming  language  and  a  formalism  for  reasoning  about  the  prob¬ 
lems  that  arise  in  fine-grain,  message-passing  parallel  computers.  In  Cantor  each  object  (the 
Cantor  equivalent  of  a  Concurrent  Smalltalk  context)  can  only  perform  a  bounded  amount  of 
computation  on  receiving  a  message,  and  that  computation  is  atomic.  Also,  messages  sent 
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from  one  object  to  another  are  guaranteed  to  arrive  in  the  original  order.  Concurrent 
Smalltalk  is  similar  to  Cantor  at  the  implementation  level — when  a  message  is  sent  to  a  con¬ 
text,  it  performs  a  bounded  amount  of  computation*,  perhaps  sends  a  few  more  messages, 
and  then  either  suspends  or  waits  for  the  next  message.  The  state  of  a  compuu  Lion  is  com¬ 
posed  mostly  of  idle  objects  and  messages  traveling  between  objects,  with  only  a  few  objects 
executing.  Hence,  at  a  superficial  level,  a  Concurrent  Smalltalk  object  code  program  is  a 
Cantor  program.  Nevertheless,  the  Concurrent  Smalltalk  object  code  program  is  more  com¬ 
plicated  because  it  might  fault  while  performing  the  computation  of  the  next  state.  One  can 
view  this  possibility  as  either  computation  being  non-atomic  or  treating  faults  as  if  they  were 
message  sends  and  suspends,  preserving  the  Cantor  model.  Another  distinction  is  that  Con¬ 
current  Smalltalk  does  not  guarantee  that  messages  between  a  pair  of  objects  will  arrive  in 
the  order  in  which  they  were  sent. 

Probably  the  best  relationship  between  Concurrent  Smalltalk  and  Cantor  is  that  Concurrent 
Smalltalk  is  a  high-level  language  that  compiles  to  Cantor-like  object  code.  At  the  source 
level.  Concurrent  Smalltalk  frees  the  programmer  from  the  myriad  of  error-prone  synchro¬ 
nization  details  found  in  Cantor.  Concurrent  Smalltalk  encapsulates  the  Cantor  concept  of 
future  flow  into  a  few  easy-to-use  primitives  such  as  touch  and  nconcurrently.  At  the 
same  time.  Concurrent  Smalltalk  presents  the  appearance  of  global  and  nested  data  struc¬ 
tures  (such  as  lexical  scoping  of  local  variables)  which  are  compiled  into  interacting  objects. 

Nevertheless,  Cantor  is  a  good  theoretical  model  for  computation  on  the  J-Machine.  For  ex¬ 
ample,  the  load  balancing  and  management  results  in  [4]  are  expected  to  also  apply  to  the  J- 
Machine.  However,  the  J-Machine  can  also  suffer  from  problems  not  discusses  in  [4],  such  as 
having  too  much  parallelism.  Some  of  the  load  balancing  issues  are  presented  in  Chapter  8. 

Acore 

Acore  [30],  an  “actor  core  language  ”  is  another  recent  actor  language.  Like  Cantor,  it  pro¬ 
vides  an  environment  in  which  a  computation  is  done  by  interacting  actors  with  limited  abili¬ 
ties;  however,  actors  in  Acore  can  compute  arbitrary  functions  to  determine  state,  and  Acore 
has  a  notion  of  a  transaction  (a  message  send  and  a  reply),  which  greatly  simplifies  pro¬ 
gramming. 

Acore  and  Concurrent  Smalltalk  are  similar  in  many  ways.  Both  languages  implement  mes¬ 
sage  sends,  replies,  concurrent  evaluation  of  subexpressions,  local  variables,  static  scoping, 
and  instance  objects  (called  actors  in  Acore).  However,  there  are  also  a  few  differences.  Due 
to  its  Smalltalk-80  heritage,  Concurrent  Smalltalk  permits  local  variables  to  be  altered, 
while  Acore  does  not;  both  languages  allow  muL.'^tion  of  instance  variables.  In  addition, 
Acore  implements  a  sponsorship  mechanism  for  higher-order  control  of  the  course  of  a  com¬ 
putation  and  a  complaint  mechanism  for  handling  exceptions.  It  remains  to  be  seen  whether 
these  mechanisms  will  be  necessary  in  Concurrent  Smalltalk^. 

Acore  is  compiled  into  Pract,  which  is  a  form  of  an  actor  assembly  language,  whereas  Con¬ 
current  Smalltalk  is  compiled  into  MDP  assembly  language.  As  a  result  of  this  difference, 
some  actions  which  are  cheap  in  one  language  are  expensive  in  the  other,  which  affects  the 
language  design.  Actor  creation  is  very  cheap  in  Acore,  while  instance  object  creation,  mod¬ 
erately  expensive  in  Concurrent  Smalltalk,  is  avoided  whenever  possible.  On  the  other  hand, 
futures  are  fairly  expensive  in  Acore,  while  they  are  very  cheap  in  Concurrent  Smalltalk; 
thus,  Concurrent  Smalltalk  creates  a  future  (or  a  cheaper  cfuture)  as  a  result  of  every  non¬ 
primitive  function  call,  achieving  maximum  concurrency  within  a  method  in  most  cases. 
Acore,  on  the  other  hand,  often  has  to  do  a  relatively  expensive  join  operation.  For  the  same 


*A8  will  be  discussed  in  Chapters  5  and  10,  the  amount  of  computation  done  by  a  Concurrent  Smalltalk  process  on 
receiving  a  message  truly  is  bounded,  but  it  is  done  for  a  more  prosaic  reason  than  keeping  a  clean  model — user 
Concurrent  Smalltalk  methods  arc  not  allowed  to  loop  without  a  message  send  somewhere  to  break  the  loop  to  pre¬ 
vent  the  incoming  message  queues  on  an  MDP  from  overflowing  if  the  loop  lasts  for  a  long  time.  Also,  long,  indivis¬ 
ible  loops  would  degrate  latency  for  other  messages  that  are  waiting  in  an  MDP’s  incoming  message  queue. 
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reason,  futures  are  transparent  in  Concurrent  Smalltalk,  while  they  are  programmer-visible 
in  Acore*. 

The  two  languages  use  the  same  mechanism  for  calling  messages.  When  a  Concurrent 
Smalltalk  process  or  an  Acore  actor  makes  a  function  or  method  call,  it  passes  a  continuation 
to  which  results  should  be  sent.  The  continuation  includes  both  a  process  and  a  slot  within 
that  process  in  which  the  result  should  be  stored. 

J-Machine  References 

[13]  and  [14]  are  good  descriptions  of  the  philosophy  of  the  J-Machine  project  and  the  early 
Concurrent  Smalltalk  language;  [15]  is  a  recent  status  report  on  the  MDP  from  the  hardware 
perspective.  [22]  describes  some  of  the  experiences  gained  from  designing  the  previous  ver¬ 
sion  of  Concurrent  Smalltalk  and  implementing  the  first-generation  Optimist  compiler.  [10] 
contains  a  nontrivial  program  written  in  an  older  dialect  of  Concurrent  Smalltalk.  [8]  and 
[9]  describe  Concurrent  Aggregates,  a  higher-level  language  than  Concurrent  Smalltalk  for 
programming  the  J-Machine.  [33]  and  [34]  describe  a  parallel  project  to  implement  dataflow 
on  the  J-Machine.  Finally,  [41]  and  [42]  analyze  the  desirability  of  supporting  the  more 
common  existing  parallel  programming  paradigms  on  the  J-Machine. 


complaint  mechanism  could  bo  built  on  top  of  Concurrent  Smalltalk  by  using  the  multiple-value  return  feature — 
one  of  the  values  could  denote  a  continuation  to  which  exceptions  should  be  routed.  Acore  uses  a  similar  im¬ 
plementation  to  handle  exceptions. 

’Nevertheless,  a  language  that  hides  futures  could  be  built  on  top  of  .\core. 
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Introduction 

A  Concurrent  Smalltalk  program  is  a  sequence  of  top-level  definitions.  Figure  2-1  shows  a 
sample  program  that  calculates  Fibonacci  numbers  using  double  recursion. 

(Defmethod  fib  Integer  () 

(if  «-  self  2) 

1 

(+  (fib  (-  self  1))  (fib  (-  self  2))))) 

Figure  2-1.  A  simple  Fibonacci  program 

This  program  calculates  Fibonacci  numbers  using  double  recursion.  Although  it  does  not  use  the  most  efficient  al¬ 
gorithm  to  calculate  Fibonacci  numbers,  it  does  illustrate  Concurrent  Smalltalk’s  implicit  concurrency. 

The  program  is  a  single  method  associated  with  the  selector  fib  and  class  integer.  The  fact 
that  the  method  takes  no  arguments  other  than  the  integer  receiver  is  indicated  by  the  empty 
list,  ( ) ,  on  the  first  line.  The  following  three  lines  contain  the  body  of  the  method.  Self  rep¬ 
resents  the  receiver  object ,  which  is  the  number  to  which  fib  was  applied.  The  if  statement 
checks  whether  that  number  is  less  than  or  equal  to  2.  If  so,  fib  returns  1.  Otherwise,  fib 
returns  the  sum  of  (fib  (-  self  l))and  (fib  (-  self  2) ),  which  are  computed  con¬ 
currently.  This  concurrent  evaluation  of  arguments  is  one  of  the  important  differences  be¬ 
tween  Concurrent  Smalltalk  and  sequential  Smalltalk. 

Fib  can  be  invoked  by  calling  it  on  an  integer  (the  receiver  object): 

(fib  30) 

Fib  would  then  calculate  and  return  the  answer  832040.  If  fib  had  any  more  arguments, 
they  would  be  included  after  the  receiver  object,  as  in: 

(fib  30  X  y  z) 

Functions 

The  Fibonacci  program  was  defined  as  a  method.  It  is  also  possible  to  define  it  as  a  function, 
as  in  Figure  2-2.  A  function  is  a  method  not  associated  with  any  class  or  selector.  Although 
in  this  example  methods  and  functions  are  equivalent,  in  other  cases,  such  as  in  iterators, 
functions  may  be  more  useful  than  methods. 

(Defun  ffib  (n) 

(if  «-  n  2) 

1 

(+  (ffib  (-  n  D)  (ffib  (-  n  2)>))> 

Figure  2-2.  A  simple  Fibonacci  program  as  a  function 

Functions  have  no  receiver  object,  so  the  parameter  n  has  to  be  specified  explicitly. 

The  syntax  for  a  method  and  a  function  call  is  the  same,  so  ffib  would  also  be  called  by: 
(ffib  30) 

The  meaning  of  applying  ffib  to  arguments  (30  in  this  case)  depends  on  whether  ffib  is  a 
selector  or  a  function.  If  ffib  were  a  selector,  a  method  lookup  would  be  done  to  determine 
the  class  of  the  first  argument  and  then  call  the  method  corresponding  to  the  selector  and 
that  class,  while  if  ffib  is  a  function,  it  is  called  directly. 


14 


Chapter  2 


Concurrent  Smalltalk 


Extracting  Methods 

A  manual  method  lookup  can  be  done  using  the  method  primitive.  Method  takes  two  pa¬ 
rameters,  a  selector  and  a  class,  and  returns  a  function  which  performs  the  same  action  as 
the  method.  For  example,  the  method  shown  in  Figure  2-1  can  be  extracted  using 
(method  fib  integer) 

The  result  behaves  just  like  the  £f  ib  function  in  Figure  2-2.  It  can  be  called  using 
( (method  fib  integer)  30) 

A  method  extracted  in  this  way  does  not  have  to  be  a  direct  method  of  the  class;  it  can  be  an 
inherited  method. 

Classes 

A  Concurrent  Smalltalk  class  is  a  type;  the  two  words  are  used  interchangeably  in  the  lan¬ 
guage  definition^  A  few  built-in  classes  are  predefined;  these  include  symbols,  booleans,  in¬ 
tegers,  floating  point  numbers,  characters,  functions,  and  other  classes.  A  complete  list  is 
given  in  table  A-2.  All  classes  are  subclasses  of  the  class  object. 

The  defclass  primitive  can  be  used  to  add  user-defined  classes.  A  class  definition  consists 
of  a  list  of  superclasses  and  zero  or  more  new  instance  variables.  Each  instance  object  of  that 
class  contains  those  instance  variables.  The  user  may  also  define  a  number  of  methods  for 
that  class.  A  simple  class  that  implements  Lisp-like  lists  is  shown  in  Figure  2-3. 

(Defclass  pair  (object)  car  cdr) 

; (Defmethod  car  pair  ()  car) 

; (Defmethod  cdr  pair  ()  cdr) 

; (Defmethod  get-car  pair  ()  car) 

;  (Defmethod  get-cdr  pair  ()  cdr) 

; (Defmethod  put-car  pair  (value) :pair  (set  car  value)  self) 

; (Defmethod  put-cdr  pair  (value) :pair  (set  cdr  value)  self) 

(Defun  cons  (first  second) :pair 

(put-car-cdr  (new  pair)  first  second) ) 

(Defmethod  put-car-cdr  pair  (first  second) :pair 
(cset  car  first) 

(cset  cdr  second) 
self) 

Figure  2-3.  The  pair  class 

The  six  methods  that  are  commented  out  by  semicolons  are  defined  automatically  by  defclass  (in  addition  to  a 
few  others  described  in  Section  A. 4).  car  and  get-car  do  the  same  thing:  both  are  defined  because  car  is 
nwre  convenient,  but  it  cannot  be  us^  in  the  body  of  a  method  of  class  pair  because  static  scoping  shadows  the 
method  car  by  the  instance  variable  car. 

The  :  pal  r  constructs  define  the  result  types  of  the  methods.  They  are  unnecessary,  but  they  do  improve  effi¬ 
ciency  and  allow  rudimentary  type  checking. 

The  class  pair  is  defined  on  the  first  line  of  Figure  2-3.  The  defclass  primitive  specifies 
the  class  name  (pair) ,  the  superclasses  ( (object) ),  and  the  instance  variables  (car  and 
cdr). 

Whenever  a  class  c  is  defined,  a  class  predicate  and  reader  and  writer  methods  are  defined 
automatically,  as  well  other,  less-used  methods  described  in  Section  A.4.  The  class  predicate 
is  a  function  named  c?  that  accepts  one  argument  a  and  returns  true  if  a  is  a  member  of 
class  C  (or  one  of  its  subclasses)  and  false  otherwise.  Also,  for  each  instance  variable  x  of  C, 


'Nonetheless,  the  words  type  andcia.ss  have  slightly  dilTcrcnt  meanings  in  the  discussion  of  the  compiler  in  Chapter 

3. 
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the  methods  x,  get-x,  and  put-x  are  deflned.  The  first  two  methods  take  an  instance  object 
O  as  an  argument  and  return  the  value  of  x  ino,  while  put-x  takes  two  arguments,  an  in¬ 
stance  object  o  and  a  new  value  v  of  x,  and  assigns  v  to  x  in  o.  The  methods  x  and  get-x  are 
known  as  reader  methods,  while  put-x  is  called  a  writer  method.  The  writer  methods  return 
0,  the  object  to  which  the  value  is  written. 

After  a  class  is  defined,  additional  methods  may  be  defined  for  it.  In  the  above  example,  a 
method  put-car-cdr  is  defined  for  the  class  pair.  Put-car-cdr  sets  the  value  of  a  pair’s 
car  and  edr  variables  and  returns  the  pair.  Inside  a  method,  the  receiver’s  instance  vari¬ 
ables  can  be  accessed  by  their  names. 

Overriding  Methods 

Consider  a  class  C2  which  is  a  subclass  of  c1.  When  a  class  c2  defines  a  method  m2  with  the 
same  selector  s  as  a  method  m1  of  c1,  the  class  c2  is  said  to  be  overriding  the  method  ml. 
When  selector  s  is  applied  to  an  object  of  class  c2  or  one  of  its  descendants,  method  m2  will  be 
used  instead  of  ml. 

Nevertheless,  sometimes  it  is  desirable  to  call  ml  on  an  object  of  class  C2.  For  example, 
method  m2  might  want  to  call  the  method  it  is  overriding.  An  overridden  method  ml  can  be 
called  by  performing  a  manual  method  lookup  using  the  form  (method  s  c1) .  The  resulting 
method  can  be  called  normally. 

Type  Restriction 

The  type  of  an  overriding  method  must  be  a  subtype  of  the  type  of  the  overridden  method. 
For  instance,  in  the  above  example  the  type  of  m2  must  be  a  subtype  of  the  type  of  ml .  This 
means  that  both  methods  must  have  the  same  number  of  arguments,  the  types  of  the  argu¬ 
ments  of  the  overriding  method  must  be  supertypes  (superclasses)  of  the  types  of  the  argu¬ 
ments  of  the  overridden  method,  and  the  result  type  of  the  overriding  method  must  be  a  sub- 
type  (subclass)  of  the  result  type  of  the  overridden  method.  If  any  argument  of  the  overrid¬ 
den  method  is  declared  inline  or  using  any  other  declaration,  either  explicitly  or  by  default, 
the  corresponding  argument  of  the  overriding  method  must  have  the  same  type  and  declara¬ 
tions.  The  results  of  violating  the  above  rules  are  undefined.  The  compiler  may  issue  errors 
if  the  above  rule  is  violated,  but  it  is  not  guaranteed  to  do  so. 

The  above  restrictions  apply  only  to  methods  being  overridden.  There  are  no  restrictions  on 
methods  with  the  same  name  declared  for  disjoint  classes  (i.e.  classes  which  are  not  sub¬ 
classes  of  each  other). 

The  Class  Object 

Methods  of  class  object  are  very  similar  to  functions.  There  are  two  main  differences  be¬ 
tween  functions  and  methods  of  class  object: 

•  A  method  of  class  object  can  be  overridden  by  a  method  of  a  more  specific  class.  For  ex¬ 
ample,  if  cons  in  Figure  2-3  is  defined  as  a  function,  no  other  function  or  method  may  be 
called  cons.  On  the  other  hand,  if  it  is  defined  as  a  method  of  class  object,  it  may  be  over¬ 
ridden  by  a  method  cons  defined  for  integers.  However,  a  method  may  not  be  overridden  by 
a  function. 

•  A  function  that  takes  no  parameters  can  be  defined,  while  a  method  must  always  take  at 
least  one  parameter — the  instance  object. 

In  the  interest  of  code  maintenance  and  readability,  it  is  recommended  that  functions  be  used 
in  cases  when  overriding  makes  no  sense;  parameter  functions  to  iterators  fall  into  this  cate¬ 
gory.  On  the  other  hand,  if  overriding  a  function  might  be  desirable,  that  function  should  be 
defined  as  a  method  of  type  object.  It  is  not  clear  whether  overriding  cons  (Figure  2-3) 
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would  be  useful,  so  it  might  be  defined  either  as  a  function  or  a  method,  depending  on  one’s 
taste. 

Local  Variables 

A  method  or  a  function  can  declare  local  variables  using  the  clet  or  let  statements  or  their 
derivatives.  For  example,  the  function  fib  from  Figure  2-1  could  be  rewritten  using  two  lo¬ 
cal  variables  as  in  Figure  2-4. 


(Defmethod  Integer  Ifib  () 

(if  (<-  self  2) 

1 

(clet 

(  (a  (Ifib  (-  self  1) )  ) 

(b  (Ifib  (-  self  2)))) 

(+  a  b))) 

Figure  2-4.  Fibonacci  program  with  local  variables 

The  above  program  is  equivalent  to  the  one  in  Figure  2-1  and  actually  compiles  into  the  same  code. 

Local  variables  declared  with  a  clet  or  a  let  statement  have  a  scope  which  is  the  body  of 
the  clet  or  let  si^atement  (except  for  the  bindings  themselves).  CLet  and  let  statements 
can  be  nested.  Local  variables  can  be  altered  using  a  cset  or  a  set  statement;  the  difference 
between  the  two  will  be  explained  in  the  Concurrency  section  below. 

Types 

The  types  (i.e.  classes)  of  various  values  can  be  declared  explicitly.  Such  declarations  serve 
three  purposes; 

•  Types  allow  the  compiler  to  generate  faster  code  by  allowing  it  to  perform  operations 
such  as  method  lookup  at  compile  time. 

•  The  compiler  can  perform  type  checking  to  find  simple  errors  such  as  passing  a  value  of 
one  type  to  a  function  that  is  expecting  a  value  of  a  different  type. 

•  Declaring  types  of  function  parameters  and  results  serves  to  document  the  code. 

For  the  purposes  of  type  inclusion,  a  type  is  its  own  supertype  and  subtype. 

Due  to  the  common  use  of  generic  types,  the  compiler’s  type  checking  is  necessarily  limited. 
In  particular,  when  an  expression  of  type  t1  is  assigned  to  a  variable  of  type  12  or  passed  as  a 
parameter  to  a  function  that  expects  type  12,  the  compiler  usually  will  give  an  error  or  a 
warning  if  11  is  not  12, 11  is  not  a  superclass  of  12,  and  12  is  not  a  superclass  of  11 .  This  does 
not  mean,  however,  that  the  semantics  of  function  parameter  and  return  type  declarations 
are  any  different  from  their  standard  interpretations — when  a  function  parameter  is  declared 
type  1,  every  value  passed  as  that  parameter  must  be  a  member  of  type  1,  and  when  a  func¬ 
tion  result  is  declared  type  1,  the  function  must  return  a  value  that  is  a  member  of  type  1  as 
that  result — the  only  difficulty  is  that  the  compiler  is  not  able  to  do  full  type  checking,  so  it 
usually  follows  the  rules  outlined  above. 

For  example,  integer  and  boolean  are  both  subclasses  of  the  object  and  magnitude 
classes  (see  Figure  A-2),  but  they  are  otherwise  unrelated  to  each  other.  Thus  an  integer 
can  be  passed  to  a  function  that  expects  an  object,  an  object  can  be  passed  to  a  function 
that  expects  an  integer,  but  a  boolean  cannot  be  passed  to  a  function  that  expects  an  in¬ 
teger.  The  second  possibility,  passing  a  more  general  type  to  a  function  that  expects  a  less 
general  one,  is  included  to  handle  the  common  case  of  extracting  values  from  general  storage 
class.  One  could,  for  example,  keep  a  pair  of  integers  and  desire  to  add  the  pair’s  car  and 
cdr  together.  Since  a  pair  is  a  generic  data  structure,  it  can  contain  values  of  type  object; 
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a  compiler  has  no  simple  way  of  knowing  at  compile  time  that  the  pair  will  contain  inte¬ 
gers,  so  the  best  it  can  deduce  is  that  the  pair's  car  and  cdr  are  objects. 

Types  can  be  declared  as  follows: 

•  To  specify  the  type  of  a  local  or  an  instance  variable,  follow  the  variable  name  with  a 
colon  and  its  type.  Several  locals  can  be  declared  using  the  same  type  by  separating  their 
names  with  commas. 

•  To  specify  the  type  of  a  function  or  method  formal,  follow  the  formal  name  with  a  colon 
and  its  type.  Several  formals  can  be  declared  using  the  same  type  by  separating  their  names 
with  commas. 

•  To  specify  the  result  type  of  a  function  or  method,  follow  the  list  of  formals  with  a  colon 
and  the  result  type^. 

•  A  type  of  an  intermediate  result  can  be  specified  using  a  type-assertion  statement^. 

The  three  kinds  of  declarations  are  illustrated  in  Figure  2-5,  yet  another  copy  of  the  Fi¬ 
bonacci  program.  All  untyped  variables,  parameters,  and  functions  and  methods  are  typed 
object  by  default. 

(Defun  tfib  (n: integer) : integer 
(if  «-  n  2) 

1 

(clet 

((a;integer  (tfib  (-  self  1))) 

(b;integer  (tfib  (-  self  2)))) 

(+  a  b))) 

Figure  2-5.  Fibonacci  program  with  types 

There  are  three  type  declarations  here.  In  order,  they  are  a  declaration  of  the  parameter  type  of  n,  a  declaration  of 
t  fib's  result  type,  and  declarations  of  the  types  of  the  local  variables  a  and  b. 


Concurrency 

Concurrency  is  expressed  in  Concurrent  Smalltalk  in  several  ways: 

•  Concurrent  argument  evaluation.  In 

(+  (big-computation  3)  (time-sink  738)) 
the  expressions  big-computation  and  time-sinlc  can  be  evaluated  in  parallel. 

•  Expressions  in  concurrently  statements  may  be  evaluated  concurrently.  The  expres¬ 
sions  in  parallel  statements  are  always  evaluated  concurrently. 

•  The  variable  bindings  in  clet  and  let  statements  can  also  be  evaluated  concurrently. 
For  example,  the  expressions  big-computation  and  time-sinlc  can  be  evaluated  concur¬ 
rently  in 

(cset  a  (big-computation  3) ) 

(cset  b  (time-sin)c  738)  ) 

(+  a  b) 
as  well  as  in 

(let  ((a  (big-computation  3)) 

(b  (time-sink  738)) 

(+  a  b)  ) 

•  The  computations  in  assignments  using  cset  and  in  function  calls  whose  result  values 
are  unused  can  be  done  concurrently  with  neighboring  statements. 


1 


Sec  also  return  values  in  section  A. 5  for  a  description  of  specifying  types  of  multiple  results. 
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•  The  computations  done  for  futures  are  always  evaluated  in  parallel. 

The  action  of  a  cset  can  be  thought  of  as  storing  a  promise  Gcnown  as  a  cfuture)  to  calculate 
the  value  of  a  variable.  For  example,  after 
(cset  a  (big-computation  3) ) 

is  executed,  a  will  contain  either  the  value  of  (big-computation  3)  or  a  cfuture  promising 
to  deliver  that  value  when  it  is  needed.  If  a  contains  a  cfuture,  (big-computation  3)  is 
evaluated  in  parallel  by  a  different  task.  At  the  same  time,  execution  of  the  method  can  pro¬ 
ceed  and  the  method  can  perform  another  time-consuming  task.  It  will  not  have  to  wait  for 
(big-computation  3)  to  complete  until  the  value  of  a  is  needed. 

Sometimes  it  is  desirable  to  explicitly  wait  until  the  value  of  an  expression  is  available  before 
continuing.  This  is  called  either  touching  or  forcing  the  expression.  Touching  or  forcing  an 
expression  that  evaluates  to  a  normal  value  does  nothing.  Touching  or  forcing  an  expression 
that  evaluates  to  a  cfuture  causes  evaluation  to  wait  until  the  value  of  the  cfuture  is  avail¬ 
able.  Finally,  touching  an  expression  that  evaluates  to  a  future  does  nothing,  while  forcing  it 
causes  evaluation  to  wait  until  the  value  of  the  future  is  available.  The  resulting  value  is 
then  touched  or  forced  again  until  the  touch  or  force  operation  does  not  change  it. 

An  expression  can  be  touched  using  the  touch  statement  and  forced  using  the  force  state¬ 
ment.  Since  built-in  methods  and  functions  usually  touch  or  force  their  arguments,  touching 
and  forcing  are  rarely  done  explicitly. 

The  reference  manual  in  Appendix  A  defines  more  precise  semantics  for  what  expressions 
may  or  may  not  be  evaluated  in  parallel. 

Locks 

(defclass  resource  (object) 

1 ; lock 

...  other  fields) 

(def method  init  resource  () 

(cset  1  (oew-simple-lock) )  /Creates  an  initially  available  lock 
...  other  initialization  code) 

(defun  new-resource  () 

(init  (new  resource) ) ) 

(defmethod  access  resource  (parameters 
(acquire  1) 

...  code  to  perform  the  access  using  parameters  ... 

(release  1) ) 

(defmethod  access2  resource  (parameters 
(with-locks  (1) 

...  code  to  perform  the  access  using  parameters  ...) ) 

Figure  2-6.  Lock  Example 

This  example  defines  a  class  resource  that  contains  a  lock.  Every  call  to  access  acquires  the  lock  when  rt 
starts  and  releases  it  when  done,  so  the  code  in  the  middle  of  the  access  method  cannot  be  interrupted  by  an¬ 
other  access  method.  The  with-locks  macro  is  a  convenient  shorthand  for  acquiring  and  releasing  locks;  the 
access  method  could  have  been  rewritten  as  access2. 

Locks  are  used  to  synchronize  computation  by  Concurrent  Smalltalk  programs.  Locks  are 
especially  useful  around  critical  sections  of  code  where  only  one  process  may  access  a  re¬ 
source;  a  process  that  wants  the  resource  acquires  a  lock  before  accessing  the  resource  and 
releases  it  when  it  is  done.  Two  variants  of  locks  are  provided.  Simple-locks  are  fast  locks 
which,  however,  perform  poorly  when  many  processes  are  waiting  for  a  resource;  simple- 


^See  section  A.6. 
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locks  should  be  used  in  situations  in  which  the  probability  of  contention  for  a  resource  is 
small.  Queueing-locks  are  slower  locks  designed  to  handle  a  large  amount  of  contention. 

As  an  example  of  the  use  of  locks,  suppose  one  wants  to  restrict  the  use  of  a  resource  so  that 
only  one  process  can  access  it  at  a  time.  To  accomplish  this  exclusion,  a  lock  can  be  associ¬ 
ated  with  the  resource,  in  which  case  every  process  should  acquire  the  lock  before  using  the 
resource  and  release  it  when  done.  Figure  2-6  shows  sample  code  used  to  access  the  resource. 

Distributed  Objects 

(defclass  distarray  (distobj) 
value) 

(defun  new-distarray  (size: integer) 

(new  distarray  size) 

(def method  get  distarray  (index: integer) 

(get-value  (co  group  index) ) ) 

(defmethod  put  distarray  (index: integer  new-value) 

(set  (get-value  (co  group  index))  new-value)) 

(defmethod  size  distarray  () 

(logical-limit  self) ) 

Figure  2-7.  Distributed  Object  Example 

This  example  defines  a  class  distarray  used  for  distributed  arrays.  The  get  method  returns  the  element  at 
position  in^x  in  the  array:  since  each  constituent  contains  only  one  element  of  the  array,  the  get  method  returns 
the  value  in  the  constituent  specified  by  the  given  index.  Similarly,  the  put  method  routes  the  message  to  the 
constituent  specified  by  index,  where  it  stores  new-value.  The  size  method  simply  returns  the  array's  size. 

Whereas  standard  objects  serialize  messages  sent  to  them',  distributed  objects  can  accept 
and  process  many  messages  at  a  time.  A  distributed  object  is  comprised  of  an  array  of  con¬ 
stituent  objects  and  a  common,  group  name.  When  a  message  is  sent  to  the  group  name,  the 
operating  system  routes  it  to  a  constituent  of  its  choosing.  The  constituent  can  then  process 
the  message  or  send  it  to  another  constituent;  constituents  know  how  to  address  each  other. 
The  CO  primitive  is  used  to  find  a  particular  constituent  of  a  distributed  object,  while  the 
group  instance  variable  can  be  read  to  determine  the  group  name  of  a  distributed  object 
given  one  of  its  constituents. 

For  example,  a  large  array  might  be  implemented  as  a  distributed  object.  When  a  get  mes¬ 
sage  is  sent  to  the  array  to  read  a  value  of  a  particular  element,  the  message  is  routed  to  one 
of  the  constituents.  That  constituent  examines  the  given  index  and  forwards  the  message  to 
the  constituent  containing  the  element,  which  reads  and  returns  the  value. 

Figure  2-7  shows  a  simple  example  of  the  use  of  distributed  objects  to  create  a  distributed  ar¬ 
ray.  Each  constituent  contains  only  one  element  of  the  array  to  keep  this  example  short;  a 
better  implementation  would  use  a  simple-array  at  each  constituent  to  reduce  the  number 
of  constituents  needed. 

The  advantages  of  using  a  distarray  class  like  the  one  in  Figure  2-7  is  that  many  accesses 
can  be  made  to  the  array  simultaneously;  they  do  not  have  to  pass  through  a  common  bottle¬ 
neck  to  access  the  array.  In  addition,  as  will  be  clarified  in  Section  3.3,  the  get  and  put 
methods  do  not  access  any  instance  variables  of  distarray  themselves,  so  they  could  be  in- 
lined  wherever  they  are  called^;  thus,  reading  or  writing  the  distarray  in  Figure  2-7  could 


'Except  for  a  few  special  cases  such  as  immutable  objects  and  messages  which  do  not  need  to  access  an  object’s  data 
to  execute,  only  one  message  may  be  processing  on  a  standard  object  at  a  time. 

*The  compiler’s  handling  of  group  would  have  to  change  a  little  to  permit  this  optimization;  the  compiler  currently 
treats  group  solely  as  an  instance  variable,  but  there  is  no  intrinsic  reason  why  the  compiler  could  not  provide  a  by¬ 
pass  path  that  checks  whether  a  method  was  called  on  a  group  ID  (as  opposed  to  a  constituent  ID)  and  just  uses  the 
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involve  only  two  message  sends,  which  is  no  less  efficient  than  reading  or  writing  a  single- 

array. 

Macros 

Concurrent  Smalltalk  provides  a  macro  facility  which  can  be  used  to  extend  the  language.  A 
macro  consists  of  a  pattern  and  a  replacement.  The  pattern  can  contain  variables  or  key¬ 
words.  If  it  matches  with  an  expression,  that  expression  is  replaced  by  the  replacement, 
which  can  be  either  another  pattern  or  a  Common  Lisp  function^  Much  of  the  language  it¬ 
self  has  been  implemented  in  terms  of  macros.  Figure  2-8  contains  a  sample  macro  which  de¬ 
fines  a  when  form  that  is  the  equivalent  of  a  Common  Lisp  when. 

(defmacro  (when  ?teat  .  ?body) 

(if  ?test 

(begin  .  ?]3ody) ) 

Figure  2-8.  When  macro 

The  when  fc>rni  defined  by  this  macro  takes  a  test  and  a  number  of  statements  comprising  the  body.  If  the  test  is 
true,  the  statements  are  executed  one  after  arx>ther,  as  in  begin.  If  the  test  is  false,  when  returns  nil.  This 
macro  takes  advantage  of  the  fact  that  if  returns  nil  if  there  is  no  else-cteusa  and  the  condition  is  false.  The 
Lisp  dot  notation  is  used  to  indicate  that  the  body  forms  the  rest  of  the  given  list 


group  ID  if  it  was  provided  instead  of  always  using  the  group  instance  variable.  When  this  optimization  is  imple¬ 
mented.  distributed  arrays  such  as  the  one  above  will  be  as  eflicient  as  simple  arrays. 

^Concurrent  Smalltalk  functions  may  be  added  as  replacements  later,  when  the  entire  compiler  and  development 
system  is  rewritten  in  Concurrent  Smalltalk. 
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Optimist  II  is  an  optimizing  compiler  for  the  Concurrent  Smalltalk  langxiage  described  in 
Appendix  A.  The  compiler  generates  assembly  language  code  for  the  Message-Driven  Pro¬ 
cessor. 

Optimist  II  is  based  on  the  Optimist  compiler  described  in  [21].  Optimist  included  many 
standard  optimizations  such  as  register  variable  assignment,  dataflow  analysis,  copy  propa¬ 
gation,  and  dead  code  elimination  [3]  [43]  that  are  used  in  compilers  for  conventional  proces¬ 
sors.  In  addition.  Optimist  included  fork  and  join  mergers  that  try  to  merge  similar  (not  nec¬ 
essarily  identical)  statements  on  both  sides  of  conditionals,  a  powerful  move  eliminator,  and 
numerous  code  generator  optimizations  to  accommodate  various  idiosyncrasies  of  the  MDP. 

Optimist  II  is  a  substantial  improvement  over  the  Optimist  compiler.  While  Optimist  sup¬ 
ported  only  a  small  subset  of  an  early  Concurrent  Smalltalk  language.  Optimist  II  imple¬ 
ments  almost  the  entire  new  Concurrent  Smalltalk  language.  Some  language  features  sup¬ 
ported  by  Optimist  II  that  were  not  present  in  the  original  Optimist  include: 

•  Method  lookup  (Optimist  could  compile  method  code  but  could  not  associate  a  method 
with  a  selector) 

•  Global  variables 

•  Class  and  variable  declarations 

•  Macros 

•  Lambdas  and  closures 

•  Multiple  inheritance  of  classes 

•  Distributed  objects 

•  Multiple  return  values 

•  Nonlocal  exits 

•  Functions 

•  Methods  referencing  more  than  one  object  at  a  time 

•  Synchronization  primitives 

•  Arrays 

•  Methods  overriding  primitive  selectors  such  as  + 

•  Compile-time  evaluation  of  expressions 

Furthermore,  Optimist  II  contains  an  interactive  language  environment,  including  a  Concur¬ 
rent  Smalltalk  interpreter  and  facilities  to  view  code  in  various  stages  of  compilation.  Opti¬ 
mist  II  gives  helpful  warnings  and  errors  when  it  encounters  questionable  language  con¬ 
structs.  It  also  includes  entire  new  categories  of  optimization,  including  type  inference  and 
global  program  optimizations.  Finally,  Optimist  Il’s  code  generator  has  been  updated  to  con¬ 
form  to  and  optimize  for  MDP  Architecture  version  IIB  [16]'  instead  of  Optimist’s  Architec¬ 
ture  10  [23]. 


'This  reference  is  to  MDP  Architecture  version  11.  Version  IIB  has  not  been  published  yet. 
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The  only  language  features  listed  in  Appendix  A  missing  from  Optimist  II  are  full  futures 
and  I/O  facilities.  It  is  expected  that  they  will  be  added  later,  when  the  operating  system  is 
updated  to  support  them.  In  addition,  some  optional  features  of  the  language  such  as  inline 
objects  and  first-class  continuations  have  not  been  implemented,  although  facilities  have 
been  provided  that  will  simplify  their  implementation  in  the  future. 

Structure 

Figure  3-1  shows  the  overall  structure  of  the  compiler.  Concurrent  Smalltalk  code  is  read 
and  parsed  by  the  reader  and  parser,  transformed  by  the  preoptimizer,  and  saved  in  the 
global  environment.  It  can  be  either  interpreted  using  the  global  environment  or  optimized 
further  by  the  optimizer  and  then  compiled  into  MDP  assembly  code  by  the  compiler  and 
assembler.  The  treewalker  controls  the  compilation  process  and  prevents  unused  modules 
and  objects  from  being  compiled  and  assembled. 

Reading  Guide 

The  Data  Structures  section  introduces  the  common  data  structures  used  in  the  Optimist  II 
compiler.  A  few  data  structures  such  as  digraphs  and  hcode  appear  throughout  the  compiler, 
and  familiarity  with  them  is  assumed  in  the  later  sections. 

The  next  three  sections  discuss  the  three  main  components  of  the  compiler  environment:  The 
Initial  Phase  includes  facilities  to  read  Concurrent  Smalltalk  expressions  and  compile  them 
into  hcode  (an  intermediate  code  format),  interpret  that  hcode,  and  maintain  the  global  Con¬ 
current  Smalltalk  environment.  This  phase  executes  until  the  user  requests  a  compilation  of 
the  program  to  MDP  assembly  code,  at  which  time  the  other  two  phases  are  invoked.  Most  of 
the  optimizations  in  Optimist  II  are  done  in  the  Optimization  phase,  although  a  few  appro¬ 
priate  optimizations  are  scattered  in  the  other  phases.  The  Code  Generation  phase  com¬ 
piles  the  optimized  hcode  into  MDP  assembly  language  and  outputs  that  assembly  language, 
together  with  immediate  objects,  class  descriptors,  and  method  tables,  after  performing  a  few 
final  optimizations.  The  output  of  the  Code  Generation  phase  can  be  read  directly  into 
MDPSim.  The  code  generator  and  MDPSim  share  the  task  of  linking  programs.  Finally,  the 
Summary  section  summarizes  the  important  ideas  in  the  compiler. 

Chapter  5,  Sample  Program,  shows  the  progress  of  a  sample  program  through  various 
phases  of  the  compiler,  and  it  may  be  helpful  to  illustrate  some  of  the  optimizations. 
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MDPSim  Executable  File 
Figure  3-1.  Optimist  II  Organization 
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3.1,  Data  Structures 


Utilities 

Optimist  II  uses  a  number  of  supporting  data  structures  throughout  the  compilation  process. 
These  include  abstractions  such  as  environments,  queues,  ordered  sets,  bit  sets,  and  exten¬ 
sions  to  CLOS.  The  supporting  data  structures  are  defined  in  the  System  Utilities,  Utilities, 
and  Digraph  files*. 

An  environment  associates  keys  with  values.  Environments  can  be  atomic,  linked  to  each 
other,  and  either  simple  or  based  on  hash  tables.  Atomic  environments  allow  a  series  of 
changes  to  be  cancelled,  which  is  a  useful  operation  if  a  syntax  error  is  found  in  the  input 
Concurrent  Smalltalk  expression.  See  the  System  Utilities  file  for  more  information  about  the 
internal  Optimist  II  environment  formats. 

The  implementation  of  digraphs  (directed  graphs)  is  discussed  in  [21].  The  digraphs  in  Op¬ 
timist  II  extend  that  implementation  by  taking  advantage  of  CLOS’s  class  inheritance  mech¬ 
anism  and  by  automatically  marking  a  digraph  altered  when  any  change  to  it  is  made,  elimi¬ 
nating  some  hard-to-find  consistency  errors.  For  example,  using  the  dfs  function  to  ask  for  a 
listing  of  the  nodes  of  a  digraph  will  always  yield  an  up-to-date  list.  Furthermore,  an  Opti¬ 
mist  II  digraph  has  a  root  dinode  that  is  attached  to  both  the  digraph’s  starting  nodes  and 
the  ending  nodes,  allowing  easy  identification  of  the  digraph’s  exit  points.  Including  the  root 
node  generalizes  some  algorithms;  for  example,  the  join  merger  can  now  join  statements  at 
the  end  of  the  digraph. 

'The  traversal  returned  by  dfs  is  not  quite  a  depth-first  search — the  search  order  is  depth-first 
modified  to  avoid  listing  a  node  ahead  of  its  predecessors  whenever  possible.  If  the  graph  is 
acyclic,  no  node  (except  the  root)  is  listed  before  its  predecessors.  The  digraph  dataflow  prob¬ 
lem  solver  [21]  [3]  has  been  updated  to  detect  this  condition  and  solve  a  dataflow  problem  on 
a  digraph  in  one  pass  if  the  digraph  is  acyclic;  otherwise,  the  dataflow  solver  makes  two  or 
more  passes  until  no  node  changes.  Moreover,  dfs  automatically  detects  and  removes  dead 
dinodes  from  a  digraph;  dead  dinodes  are  dinodes  which  cannot  be  reached  by  following  the 
edges  in  the  digraph  in  the  forward  direction  starting  at  the  root,  but  which  can  be  reached 
from  the  root  by  following  edges  in  the  undirected  digraph. 

The  other  structures  based  on  digraphs  such  as  modules  are  similar  to  those  in  Optimist. 
See  the  Digraph  file  for  more  details,  including  the  dataflow  problem  solver  and  a  directed 
graph  mapper  utility. 

Hcodes 

Hcode  is  the  primary  intermediate  code  format  of  the  Optimist  II  compiler.  It  is  loosely 
based  on  I-code  found  in  Optimist.  Hcodes^  are  represented  by  instance  objects  of  CLOS 
classes,  and  there  is  no  uniform  syntax  for  reading  and  writing  programs  in  hcode  form, 
although  the  show  utility  prints  hcodes  fairly  well.  In  addition,  the  usage  of  hcodes  is  not 
uniform  throughout  the  compiler.  The  sets  of  hcodes  allowed  in  different  stages  of  the  com¬ 
piler  differ — some  hcodes  are  used  early  and  then  banned,  while  others  are  introduced  just 
before  assembly  code  is  generated.  The  number  of  hcodes  used  in  the  compiler  is  small  and 
fixed — there  are  only  thirteen  hcodes,  and  nine  of  them  are  limited  to  certain  phases  the 
compiler.  Since  there  are  few  hcodes,  most  operations  can  be  expressed  in  only  one  way  in 
hcode,  and  the  optimization  algorithms  have  to  handle  only  a  few  cases  instead  of  many  syn¬ 
onymous  I-codes,  as  used  in  Optimist. 


*Sce  Appendix  E  for  information  on  gelling  copies  of  the  Tiles. 
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Table  3-1.  Hcodes 


HCode 

Arquments 

Usage* 

Action 

Directive 

Directive 

Directive-arqs 

1,  Pre 

Evaluate  top-level  directive  such  as  add-method 
on  the  directive-arqs. 

Application 

Targets 

Fund 

Aras 

1.  Pre,  0. 
Post,  C 

Apply  fund  to  args  and  put  the  result  values  in 
the  targets. 

Assert-Type 

1.  Pre.  0. 
Post 

Assert  that  the  argument  value’s  type  is  a  sub¬ 
type  of  type. 

Move 

Target 

Source 

1.  Pre.  0. 
Post.  C 

Move  value  from  source  to  target. 

Make-Closure 

The-Lambda 

Sources 

1.  Pre,  0. 
Post 

Make  a  closure  out  of  the-lambda  using  sources 
as  the  values  of  the  display  arguments. 

Nconcurrently 

Threads 

fJESKM 

Execute  threads  concurrently. 

If 

Condition 

Arqument 

1,  Pre.  0, 
Post,  C 

Branch  if  argument  satisfies  condition.  Table  3-2 
lists  the  allowed  conditions. 

Touch 

Argument 

1,  Pre,  0. 
Post,  C 

Touch  the  argument 

Force 

Argument 

1,  Pre,  0, 
Post 

Force  the  argument 

Make-Future 

Target 

Argument 

Lazy 

1,  Pre.  0. 
Post 

Make  a  future  which  will  evaluate  the  lambda 
passed  as  an  argument.  Store  the  future  in  tar- 
qet.  The  future  is  lazy  if  lazy  is  true. 

l.C 

Commence  function  or  method  execution. 

C 

Terminate  function  or  method  execution. 

Arqument 

c 

Temporarily  dereference  an  instance  object. 

*The  Usage  column  specifies  the  stages  of  the  compiler  in  which  the  hcode  is  valid.  The  stages  are: 

I  Hcode  before  initial  transformations. 

Pre  Pro-optimized  hcode.  This  hcode  is  stored  in  the  globaJ  environment. 

O  Hcode  during  most  of  the  main  optimization  phase. 

Post  Hcode  during  the  MDP-specific  post-optimization  phase. 

C  Hcode  just  before  it  is  compiled  into  MDP  assembly  language. 

Table  3-1  lists  the  hcodes.  Most  hcodes  contain  fields  such  as  arguments  and  targets.  An  ar¬ 
gument  field  can  contain  any  rvalue^,  while  a  target  field  can  contain  any  lvalue.  Also,  a  type 
field  can  contain  any  type,  while  a  class  field  requires  aclass.  The  formats  of  those  fields  are 
listed  in  Table  3-3. 

There  is  no  hcode  that  returns  a  value  from  a  function  or  a  method.  Instead,  a  special  lvalue 
is  used  to  represent  a  continuation  to  the  caller.  A  value  is  returned  by  storing  it  using  a  ref¬ 
erence  to  the  continuation  as  a  target.  Thus,  a  move  hcode  with  a  reference  to  a  continuation 
as  a  target  is  really  a  return  statement,  while  an  application  hcode  with  a  reference  to  a  con¬ 
tinuation  as  a  target  is  a  tail-forwarded  application.  More  complicated  combinations  are  also 
permitted — an  application  hcode  that  returns  two  values  can  forward  one  to  a  continuation 
and  store  the  other  in  a  local  variable,  or  continuations  to  several  different  callers  within 
whose  static  scopes  a  function  resides  could  be  used.  The  benefits  of  not  including  a  return 
hcode  are  a  more  orthogonal  set  of  hcodes  and  a  simplification  in  the  tail  forwarder,  which 
now  becomes  a  somewhat  specialized  move  eliminator. 

Every  hcode  has  exactly  one  successor  in  the  digraph  except  the  if  hcode,  which  has  two,  cor¬ 
responding  to  evaluating  the  conditional  as  true  or  false.  The  nconcurrently  hcode  has  only 
one  successor,  but  it  also  contains  a  set  of  nested  digraphs,  which  may  be  evaluated  concur- 


^Sometimcs  the  word  statement  will  also  be  used  to  refer  to  an  hcode. 
^Rvalues  are  defined  below. 
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Table  3*2.  Conditions 

irue  Branch  if  the  argument  is  true.  The  argument  must  be  a  boolean, 

iaise  Branch  if  the  argument  is  false.  The  argument  must  be  a  boolean, 

mil  Branch  if  the  argument  is  eq  to  nil.  The  argument  can  be  any  object. 

:non-nil  Branch  if  the  argument  is  not  eq  to  nil.  The  argument  can  be  any  object. 

;2ero  Branch  if  the  argument  is  equal  to  0  using  the  =  predicate. 

:non-2ero  Branch  if  the  argument  is  not  equal  to  0  using  the  =  predicate. 


rently,  sequentially,  or  interleaved  in  any  fashion.  There  is  no  restriction  on  the  number  of 
predecessors  an  hcode  can  have. 

Hcodes  are  rarely  processed  alone;  usually  hcodes  are  embedded  in  a  code-lambda  or  cst- 
lambda,  which  represent  digraphs  of  hcodes  with  header  information.  A  code-lambda  con¬ 
tains  a  digraph  of  hcode  statements  together  with  a  database  of  local  variables  used  by  those 
statements.  Each  local  variable  has  an  optional  name,  a  type,  and  some  declarations  such  as 
whether  it  can  hold  inline  objects.  Furthermore,  the  locals  in  a  code-lambda  are  consecu¬ 
tively  numbered  to  allow  the  efficient  use  of  bitmaps  to  keep  track  of  variable  data  while 
solving  dataflow  problems.  In  addition,  a  code-lambda  shares  with  csl-functions  (another  in¬ 
ternal  Optimist  II  class  that  describes  all  functions,  including  primitives)  the  interface  fields 
which  consist  of  a  list  of  parameters,  return  values,  and  display  variables  used  by  closures. 

Hcodes  are  documented  in  the  HCode  file. 

Values 

An  Optimist  II  value  is  a  representation  of  a  Concurrent  Smalltalk  object — it  can  be,  say,  an 
integer,  a  character,  a  distributed  object,  a  function,  a  class,  or  any  other  valid  Concurrent 
Smalltalk  object.  On  the  other  hand,  a  variable  or  a  parameter  is  not  a  value,  but  it  may  con¬ 
tain  a  value.  In  addition,  values  of  a  few  hidden  types  such  as  continuations  and  continua¬ 
tion  displacements  are  also  used.  Many  different  representations  are  used  for  values,  and 
these  representations  will  not  be  described  further  here;  please  refer  to  the  Types  file  for 
more  details  on  this  subject. 

An  rvalue  can  be  either  a  value  or  a  location  that  can  be  read  to  obtain  a  value.  Thus,  a  local 
or  a  global  variable  is  an  rvalue,  and  so  is  the  Concurrent  Smalltalk  integer  7.  An  instance 
variable  in  general  is  not  an  rvalue,  but  a  reference  to  an  instance  variable  in  a  particular  in¬ 
stance  object  is.  The  common  rvalue  kinds  are  listed  in  Table  3-3. 

Table  3-3.  Rvalues 

^Rvalue___^_Sgeclalizers___Nmes_______^__^^^^____^^_^^ 

Value  Any  value  is  also  an  rvalue. 

Local  Name,  scope,  etc.  A  local  variable. 

Global  Name  A  global  variable. 

Option  Name  A  Concurrent  Smalltalk  option. 

Ivar-ref  Instance  variable.  An  instance  variable  of  an  instance  object.  The 

Instance  object  instance  object  must  also  be  an  rvalue. 

An  lvalue  is  a  location  into  which  a  value  can  be  written.  Examples  of  rvalues  include  local 
variables,  references  to  instance  variables  in  instance  objects,  and  references  to  continua¬ 
tions.  A  continuation  by  itself  is  not  an  lvalue,  but  a  reference  to  one  is.  The  common  lvalue 
kinds  are  listed  in  Table  3-4. 
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Table  3-4.  Lvalues 


Local 

Global 

Continuation-ref 


SgecjallMrs^^ 

Name,  scope,  etc. 
Name 

Continuation  or 
Context  and 
Displacement 


A  local  variable. 

A  global  variable. 

A  reference  to  a  continuation  specified  either  as 
a  continuation  rvalue  or  as  a  pair  of  context  and 
displacement  rvalues  (See  Section  3.3). 


All  rvalues  are  instances  of  the  rvalue  CLOS  class,  all  lvalues  are  instances  of  the  lvalue 
CLOS  class,  and  all  values  are  instances  of  the  value  and  rvalue  CLOS  classes.  CLOS’s  mul¬ 
tiple  inheritance  is  used  to  define  objects  that  are  both  rvalues  and  lvalues  or  other  combina¬ 
tions  of  the  above. 


Types  and  Classes 


Table  3-5.  Types 

Type  Speclalizers  Notes 

Class  Any  class  is  also  a  type. 

Continuation-type  Continuation-type  A  type  based  on  the  continuation  class  that 

represents  a  continuation  that  will  return  a 
value  of  the  continuation-type  type. 

Displacement-type  Continuation-type  A  type  based  on  the  displacement  class  that 

represents  a  displacement  field  of  a  continua¬ 
tion  that  will  return  a  value  of  the  continua¬ 
tion-type  type. 

A  Concurrent  Smalltalk  class  is  a  Concurrent  Smalltalk  value  that  is  an  instance  of  the 
class  class.  Classes  are  implemented  in  Optimist  II  as  instances  of  the  cst-class  CLOS 
class.  In  addition  to  itself  being  a  value,  a  class  also  represents  a  set  of  values.  For  example, 
the  class  integer  represents  the  set  of  all  integers,  which  includes,  among  others,  the  values 
4  and  -17.  The  class  null  represents  the  singleton  set  (nil).  The  class  class  represents 
the  set  of  all  Concurrent  Smalltalk  classes,  including  itself 

In  addition  to  classes.  Optimist  II  includes  types  which  provide  finer  discrimination  than 
classes  for  describing  sets  of  values.  Types  are  listed  in  Table  3-5.  Currently  a  type  is  either 
a  class  or  a  continuation  that  returns  an  object  of  some  type.  A  type  can  be  always  projected 
to  a  class;  the  base-class  Lisp  generic  function  performs  this  conversion.  A  type  that  is  also  a 
class  projects  to  itself,  while  a  continuation  type  projects  to  the  class  continuation. 
Although  a  class  is  always  a  value,  a  type  is  not  necessarily  a  value. 

Multitypes 

When  describing  the  possible  contents  of  variables.  Optimist  II  uses  the  concept  of  a  multi- 
type.  A  multitype  is  a  list  of  zero  or  more  types;  a  value  is  a  member  of  a  multitype  (satisfies 
that  multitype)  if  it  is  a  member  of  one  of  its  types.  No  value  satisfies  a  null  multitype,  while 
every  value  satisfies  a  multitype  that  has  object  as  one  of  its  types.  Routines  are  provided 
to  calculate  unions  (least  upper  bound)  and  intersections  (greatest  lower  bound)  of  multi¬ 
types  and  simplify  representations  of  multitypes.  Since  multitypes  are  not  necessarily  closed 
under  those  operations,  the  lub  and  gib  routines  may  conservatively  enlarge  their  multitype 
results. 
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Global  Data  Structures 

Two  atomic  environments,  the  global  environment  and  the  class  environment,  contain  most 
of  the  state  of  the  Concurrent  Smalltalk  interpreter.  The  global  environment  contains  all 
Concurrent  Smalltalk  globals,  parameters,  and  constants,  while  the  class  environment  con¬ 
tains  all  known  Concurrent  Smalltalk  classes.  The  global  environment  is  linked  to  the  class 
environment,  so  the  latter  is  searched  if  an  identifier  is  not  foimd  in  the  global  environment. 

The  classes  are  themselves  heavily  linked  together.  Each  class  object  has  lists  of  its  immedi¬ 
ate  superclasses  and  subclasses  and  all  of  its  superclasses  and  subclasses,  as  well  as  a  meta¬ 
class,  a  description  of  its  instance  variables,  and  sundry  options  such  as  whether  the  class  is 
immutable.  To  allow  typed  recursive  data  structures,  an  “undefined”  class  structure  is  cre¬ 
ated  when  a  class  name  is  encountered  in  a  program  without  being  defined.  An  “undefined” 
class  can  turn  into  a  normal  class  when  the  class  is  defined;  CLOS’s  change-class  construct  is 
very  valuable  here.  A  substantial  number  of  classes  have  to  be  updated  whenever  a  new 
Concurrent  Smalltalk  class  is  defined,  but  compilation  speed  does  not  seem  to  suffer  because 
of  this.  The  heavy  linking  of  classes  made  defining  a  bootstrapping  subset  of  Concurrent 
Smalltalk  classes  challenging;  some  CLOS  objects  had  to  be  created  with  the  wrong  classes 
and  then  transformed  to  the  right  classes.  Once  the  bootstrapping  subset  of  Concurrent 
Classes  was  defined,  defining  the  remaining  classes  on  top  of  it  was  easy. 

A  method  is  associated  with  both  a  class  and  a  selector.  There  is  no  single  method  table  in 
Optimist  II;  instead,  whenever  a  method  is  added,  it  is  added  to  the  selector’s  list  of  methods 
hashed  by  class  and  the  class’s  list  of  methods  hashed  by  selector.  Thus,  a  selector  knows  all 
of  the  methods  defined  for  it,  as  does  a  class.  Methods  are  not  replicated  in  these  hash  tables 
unless  a  method  is  added  more  than  once;  instead,  the  lookup-method  function,  which  returns 
a  method  associated  with  a  class  and  a  selector,  searches  the  superclasses  when  a  method  is 
not  defined  for  a  selector  and  a  class;  an  ambiguous  selector  error  is  signalled  if  there  is  more 
than  one  superclass  and  they  are  associated  with  differing  methods. 

Current  settings  of  the  options  are  also  kept  in  a  global  data  structure.  Each  option  is  de¬ 
clared  as  a  dynamic  Lisp  variable,  and  a  list  of  all  options  and  their  default  values  is  kept  in 
an  object.  The  #&name  reader  macro  expands  into  a  reference  of  the  option  named  name. 

Concurrent  Smalltalk  symbols  are  not  accumulated  in  any  data  structure;  however,  when  a 
Lisp  symbol  is  used  as  a  Concurrent  Smalltalk  symbol,  its  cst-symbol  property  is  set  to  the 
Concurrent  Smalltalk  symbol  object  to  ensure  that  that  object  is  reused  if  the  symbol  is  ref¬ 
erenced  again;  otherwise,  (eq  '  sym  '  sym)  would  be  false  according  to  the  interpreter. 
Number  objects  are  not  reused,  so  (eq  13  13)  is  false  according  to  the  interpreter’,  but 
(clet  ((X  13))  (eq  x  x)  )  is  true. 


’Nevertheless,  compiled  code  will  currently  return  true  if  eq  is  used  to  compare  two  equal  integers.  The  action  of 
eq  on  numbers  is  purposely  not  denned  in  Concurrent  Smalllolk  to  allow  an  implementation  of  a  bignum  package. 
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3.2.  Initial  Phase 

The  initial  phase  of  the  compiler  reads  the  Concurrent  Smalltalk  input  and  converts  it  into  a 
rough  hcode  form.  Several  early  transformations  have  to  be  done  on  the  resulting  hcode  be¬ 
fore  it  becomes  suitable  for  optimizations. 

The  most  complicated  early  transformations  create  statically  scoped  functions.  The  initial 
phase  determines  parameter  interfaces  for  lexical  variable  displays  [3]  used  by  closures,  and 
it  does  a  considerable  amount  of  work  to  pick  those  interfaces  well.  Delaying  this  decision 
would  have  made  manipulation  of  functions  in  that  stage  very  difficult;  the  advantages  of 
splitting  nested  functions  into  components  early  are  that  every  function  is  self-contained  and 
completely  owns  its  local  variables — no  other  function  can  alter  or  examine  the  local  vari¬ 
ables. 

Reader 

A  customized  Common  Lisp  reader  is  used  to  read  the  Concurrent  Smalltalk  programs.  The 
customizations  consist  of  using  a  special  readtable  and  reading  all  Concurrent  Smalltalk 
names  into  the  CST  package.  The  readtable  is  used  to  implement  the  special  characters  in 
the  Concurrent  Smalltalk  syntax.  Most  special  characters  expand  into  lists;  for  example,  !  a 
expands  into  ( !  a) .  Some  character  tokens  such  as  : ,  :  : ,  and  ,  (comma)  expand  into  sym¬ 
bols  with  the  same  names. 

The  CST  package  is  used  to  prevent  conflicts  between  Concurrent  Smalltalk  symbols  and  any 
symbols  the  compiler  or  the  Common  Lisp  environment  might  be  using.  For  instance,  nil  is 
just  the  name  of  a  constant  (which  happens  to  have  the  value  '  nil)  in  Concurrent  Smalltalk; 
nil  is  not  confused  with  the  Lisp  nil,  which  also  represents  an  empty  Lisp  list.  Since  the 
colon  has  a  special  readtable  meaning  in  Concurrent  Smalltalk  mode.  Concurrent  Smalltalk 
symbols  are  restricted  to  the  CST  package. 

Read  macros  have  been  inserted  into  both  the  Common  Lisp  readtable  and  the  Concurrent 
Smalltalk  one  to  facilitate  easy  switching  between  the  two  tables.  The  #  $  macro  in  standard 
Lisp  input  reads  the  next  token  in  Concurrent  Smalltalk  mode,  while  #''  can  be  used  inside  a 
#  $-expression  to  switch  back  to  Lisp  mode.  In  addition,  the  #L  macro  in  Concurrent 
Smalltalk  mode  reads  a  list  expression  and  returns  a  two-element  list  with  the  symbol  lisp  as 
its  first  element  and  the  expression  read  as  the  second. 


Parser 

The  parser  parses  the  input  expressions  into  a  prototypical  hcode  form.  The  parser  is  a  re¬ 
cursive  descent  macro  evaluator.  Each  primitive  in  Concurrent  Smalltalk  is  implemented  as 
a  macro.  There  are  three  main  kinds  of  macros:  normal  macros  substitute  Concurrent 
Smalltalk  text  with  other  literal  Concurrent  Smalltalk  text  as  described  in  Section  A 14,  non¬ 
terminal  macros  substitute  Concurrent  Smalltalk  text  with  Concurrent  Smalltalk  text  pro¬ 
duced  by  a  Lisp  function,  and  terminal  macros  read  Concurrent  Smalltalk  text  and  perform 
an  action  such  as  emitting  hcodes.  Furthermore,  macros  can  be  restricted  to  evaluate  at  the 
top  level  only. 

The  parser,  when  asked  to  parse  an  expression,  compares  it  against  macros  in  its  macro  list 
in  reverse  chronological  order  until  it  finds  a  match;  when  a  match  occurs,  the  macro  is  ex¬ 
panded  as  above.  If  the  macro  was  not  a  terminal  one,  the  resulting  text  is  expanded  again 
until  either  no  macro  matches  the  text  or  a  terminal  macro  is  expanded.  If  no  macro  applies, 
the  text  must  be  a  symbol,  which  is  looked  up  in  the  current  lexical  environment.  If  the  sym¬ 
bol  is  not  found  in  the  current  environment,  it  is  assumed  to  be  an  undefined  global  unless  it 
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happens  to  be  one  of  the  Concurrent  Smalltalk  primitive  names  or  the  warn-f  ree-refer- 
ences'  declaration  is  in  effect,  in  which  case  an  error  or  a  warning  is  given. 

Macro  Implementation 

Since  the  parser  is  an  intensive  user  of  macros,  a  fast  implementation  of  macros  is  used  to 
make  the  parser  in  the  compiler  fast.  Macros  are  stored  in  linked  lists  hashed  by  the  first 
non-variable  symbol  in  the  macro  pattern;  macros  with  no  such  symbols  are  stored  in  a  sepa¬ 
rate  list.  Thus,  relatively  few  macros  have  to  be  examined  for  a  given  piece  of  Concurrent 
Smalltalk  text.  Furthermore,  the  macros  themselves  are  compiled  Lisp  functions  that  check 
that  their  patterns  are  satisfied  and,  if  so,  compute  the  text  replacement  or  perform  their 
terminal  actions.  Compiling  macros  avoids  the  costly  interpreted  unification  step  during  pat¬ 
tern  matching.  The  make-macro-text  function  in  the  Environment  file  compiles  a  macro  into  a 
Lisp  function. 

If  a  macro  contains  an  @  directive  in  its  pattern,  the  macro  expander  calls  itself  recursively 
on  the  text  matching  the  @  directive.  In  this  case  it  does  not  allow  terminal  macro  expansion 
on  that  text. 

Environments 

While  the  parser  is  generating  code,  it  frequently  needs  to  determine  the  meanings  of  identi¬ 
fiers.  It  uses  linked  environments  to  keep  track  of  statically  scoped  identifiers  such  as  the 
names  of  local  variables  and  continuations.  The  last  local  environment  is  linked  to  the  global 
environment  to  cause  a  search  of  the  global  and  class  environments  when  an  identifier  is  not 
defined  locally.  Optimist  II  distinguishes  local  variables  according  to  whether  they  are  eq  to 
each  other  or  not.  Thus,  no  alpha-renaming  is  necessary  anywhere  in  the  parser.  Also,  a 
lambda  may  reference  local  variables  it  captured  from  an  enclosing  lambda.  Since  most  of 
the  optimizations  cannot  handle  externally  visible  local  variables,  such  local  variables  are 
“unshared”  before  the  optimization  pass  is  invoked. 

Concurrent  Smalltalk  Runtime 

Most  of  the  Concurrent  Smalltalk  directives  described  in  Appendix  A  are  macros  which  ex¬ 
pand  into  either  other  Concurrent  Smalltalk  primitives  or  hidden  primitives.  The  Runtime 
file  contains  a  listing  of  all  macros  used  by  Concurrent  Smalltalk. 

Top-Level  Primitives 

Most  Concurrent  Smalltalk  top-level  primitives  listed  in  Appendix  A  expand  into  the  directive 
hcodes  and  are  evaluated  at  expression  interpretation  time.  Directive  hcodes  may  be  inter¬ 
preted  but  not  compiled;  to  ensure  that  no  directive  will  be  compiled,  directives  are  prohibited 
inside  lambdas  (and,  of  course,  any  constructs  which  expand  into  lambdas).  A  few  directives 
such  as  include,  top-level  set,  and  def  class^  are  evaluated  by  the  reader;  those  directives 
must  be  placed  at  the  top  level — ^they  may  not  be  nested  in  any  expression  except  a  top-level 
begin,  which  evaluates  its  arguments  sequentially  at  the  top  level. 

Method-Lambdas 

A  method-lambda  of  a  class  c  expands  into  a  lambda  with  a  formal  self  of  type  C  prepended  to 
the  method-lambda’s  formats  and  a  (_with-object  (self;c)  ...)  form  surrounding  the  body  of  the 
lambda.  The  _with-object  form  establishes  bindings  in  the  parser’s  environment  that  associ- 


^Sce  Appendix  B. 

^Defclass  isn’t  really  evaluated  by  the  reader;  nevertheless,  it  must  be  a  top-level  form  because  it  expands  into  a 
top-level  begin  containing  the  internal  class  definition  followed  by  definitions  of  accessor  and  predicate  methods. 
The  internal  class  definition  has  to  have  been  interpreted  before  the  accessor  method  definitions  are  read',  otherwise, 
the  reader  will  complain  about  an  undefined  class.  Grouped  forms  not  at  the  top  level  and  not  in  a  top-level  begin 
are  read  as  a  group  and  then  interpreted  as  a  group. 
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ate  names  of  c’s  instance  variables  to  ivar-refs  of  the  corresponding  instance  variables  pointed 
by  the  self  object.  The  action  of  _with-object  is  analogous  to  that  of  the  symbol-macrolet  con¬ 
struct  in  CLOS  [6]. 

Optimist  II  does  not  restrict  a  lambda  to  referencing  only  one  instance  object;  in  fact,  through 
inlining  of  method-lambdas  or  accessor  methods,  a  lambda  can  reference  many  objects  at  the 
same  time.  Objects  may  also  be  referenced  through  the  use  of  _with-object  directly  in  Con¬ 
current  Smalltalk  code,  but  this  practice  is  discouraged,  as  it  uses  a  nonstandard  feature  of 
the  language  and  gains  no  real  functionality. 

Loops 

Although  Optimist  II  can  optimize  and  output  code  with  loops  in  it,  loops  are  currently  not 
implemented  this  way.  The  problem  is  that  a  Concurrent  Smalltalk  function  with  a  loop  in  it 
might  execute  for  a  long  time  and  not  allow  any  other  messages  to  be  processed  at  its  node. 
To  prevent  this  problem,  loops  are  implemented  as  closures  which  pass  themselves  as  argu¬ 
ments —  (while  (<  i  10)  (set  i  (+  i  1))  expands  into: 

(clet  ( (_loop 

(lambda  ( (_loop-arg: function  4no-leak) ) : : (_while) 

(if  (<  i  10) 

(set  i  (+  i  1)  ) 

(return  _while  ’nil)) 

(_loop-arg) ) ) ) 

(_loop  _loop) ) 

The  _loop  function  is  called  and  passed  itself  as  an  argument.  If  i  is  less  than  10,  _loop 
increments  i  and  calls  its  argument  tail-recursively;  otherwise,  it  returns  nil  to  the  caller. 
The  tail-recursive  call  breaks  the  long  invocation  of  the  function. 

The  compiler  is  not  yet  sophisticated  enough  to  detect  that  the  value  of  the  _loop  variable 
never  changes,  so  the  _loop-arg  argument  to  the  internal  function  can  be  eliminated  and 
the  function  could  call  itself  recursively  directly. 

Initial  Transformations 

Immediately  after  the  hcode  is  created  by  the  parser,  a  transformation  and  an  optimization 
are  done  on  it.  The  first  transformation  flattens  all  exit  hcodes  out  of  every  newly  created 
lambda.  Exit  hcodes  are  generated  by  the  exit  Concurrent  Smalltalk  primitive,  which  may 
also  be  a  result  of  the  expansion  of  a  return  statement.  Each  exit  hcode  in  the  lambda  is 
removed  and  the  preceding  statement  linked  to  the  digraph’s  root  dinode  to  indicate  that  the 
execution  of  the  lambda  should  terminate  at  that  point.  Sometimes  exit  hcodes  can  be  found 
nested  inside  nconcurrently  hcodes;  if  that  is  the  case,  the  exit  flattener  moves  as  many  of  the 
nconcurrently’s  threads  outside  as  it  needs  to  remove  all  exit  hcodes  from  the  nconcurrently. 
Then  it  flattens  the  exits  as  usual.  An  example  is  shown  in  Figure  3-2. 

Simple  structural  optimizations  are  done  immediately  after  the  exits  are  flattened.  These 
optimizations  do  not  depend  on  dataflow  analysis  and  can,  therefore,  be  done  before  lexical 
variables  are  untangled.  The  optimizations  consist  of  the  following  transformations: 

•  If  statements  with  identical  consequents  and  alternatives  are  deleted. 

•  If  statements  conditioned  on  constants  are  deleted,  and  resulting  dead  code,  if  any,  elimi¬ 
nated. 

•  Move  statements  with  identical  sources  and  destinations  are  deleted. 

•  Assert-type  statements  on  constants  are  checked  and  deleted.  The  compiler  generates  an 
error  if  an  assertion  fails. 
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Figure  3-2.  Exit  Flattening  Example 

Exit  statements  are  inserted  by  the  parser  in  all  places  in  which  the  execution  of  a  lambda  should  terminate.  As 
the  first  transformation,  those  exit  statements  are  removed  and  replaced  with  links  back  to  the  root  of  a  digraph. 
For  example,  part  (a)  shows  the  main  body  of  a  lambda  with  two  sub-digraphs  that  are  the  threads  of  a  nconcur- 
rently.  After  exit  removal  (b),  all  exit  paths  are  linked  back  to  the  root  of  the  main  body  of  the  lambda,  which  also 
required  the  inlining  of  one  of  the  nconcurrently's  threads. 

•  Touch  and  force  statements  on  constants  are  deleted. 

•  Empty  nconcurrently  statements  are  deleted. 

•  One-thread  nconcurrently  statements  are  replaced  by  their  threads. 

The  structural  optimizations  are  done  for  two  reasons:  First,  structural  optimizations 
shorten  the  hcode,  using  less  memory  in  the  later  compiler  stages  and  making  them  run 
faster.  Second,  structural  optimizations  may  remove  some  variable  references,  improving  the 
quality  of  the  code  produced  by  lambda-collapsing  and  the  nconcurrently  flattener  in  the  op¬ 
timization  phase. 
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Lambda-Collapsing 

Lambda-collapsing  is  the  process  of  unnesting  nested  lambdas.  After  lambda-collapsing, 
each  lambda  has  exclusive  access  to  its  local  variables.  Lambda-collapsing  becomes  difficult 
when  the  inner  lambdas  reference  the  outer  lambdas’  local  variables  and  continuations. 
Since  continuations  are  restricted  local  variables,  they  will  not  be  discussed  here  further. 
Lumbda-coilapsing  occupies  niost  of  the  Preoptimizer  file.  Since  lambda-collapsing  is  a  com¬ 
plex  process,  an  illustrative  example  is  provided  at  the  end  of  this  section. 

The  lambda-collapser  (the  assign-lexicals  Lisp  function)  examines  each  outermost  lambda  in 
the  hcode  produced  by  the  initial  transformations.  For  each  outermost  lambda  L  it  looks  at 

the  lambdas  Nf,  N2 . N|(  nested  in  L  and  their  free  variables.  Each  nested  lambda  Nj  is 

considered  to  also  include  any  lambdas  nested  in  it.  Thus,  if,  say,  N2  contains  a  lambda  N2,i 
that  references  a  variable  x  that  is  not  defined  in  N2,i  or  N2,  then  x  is  a  free  variable  of  both 
N2,i  and  N2.  If  a  nested  lambda  Nj  does  not  reference  any  free  variables,  it  is  a  self-contained 
lambda  and  a  first-class  data  object  and  does  not  present  any  difficulties  here.  Otherwise,  Nj 
is  the  code  portion  of  a  closure. 

The  lambda-collapser  first  calculates  the  sets  of  free  variables  read  and  written  by  Nj.  Next, 
the  lambda-collapser  considers  each  local  variable  Xj  of  L.  A  local  Xj  is  called  a  mutable  lexical 
if  it  is  either  (1)  written  by  any  Nj  or  (2)  read  by  any  closure  Nj  and  written  by  L  after  the  clo¬ 
sure  Nj  has  been  created  by  L  and  before  the  closure  was  called  for  the  last  time.  Mutable 
lexicals  of  the  first  kind  are  easy  to  determine  by  scanning  every  Nj  and  checking  which  free 
variables  are  written  in  any  hcode  in  it.  To  determine  mutable  lexicals  of  the  second  kind, 
the  lambda-collapser  solves  a  few  dataflow  problems  on  L.  In  effect,  to  each  variable  Xj  in  L, 
it  assigns  a  state  machine  Sj  (Figure  3-3)  and  uses  the  dataflow  problem  solver  to  run  Sj 
through  all  possible  control  paths  in  L.  If  Sj  ever  enters  state  4,  Xj  is  a  mutable  lexical  of  the 
second  kind.  The  state  machine  assumes  that  any  local  variable  Xj  that  is  modified  after  the 
creation  of  a  live*  closure  which  reads  Xj  is  a  mutable  lexical.  Since  the  compiler  cannot  cur¬ 
rently  determine  when  a  lambda  finishes  executing,  it  cannot  optimize  local  variables  that 
are  modified  by  L  only  after  the  closures  have  completed  execution. 


Figure  3-3.  Lexical  Variable  State  Machine 

Each  local  variable  starts  in  state  0  at  the  beginning  of  the  lambda.  For  each  local  variable  every  possible  path  of 
control  flow  is  traversed  and  a  state  updated  as  above.  It  ^e  variable  ever  enters  state  4,  it  must  be  a  mutable 
lexical  of  the  second  kind — the  variable's  value  cannot  be  saved  with  the  closure  when  the  closure  is  made 


*If  the  closure  is  not  called,  it  is  not  a  live  closure,  and  the  variable  is  not  necessarily  a  mutable  lexical. 
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Any  variable  that  is  free  in  one  of  the  lambdas  Nj  and  is  not  a  mutable  lexical  is  an  im¬ 
mutable  lexical.  Once  all  of  the  free  variables  in  the  sublambdas  of  L  have  been  classified, 
the  lambdas  are  separated. 

Each  sublambda  Nj  of  L  that  has  free  variables  is  assigned  a  number  of  display  parameters  in 
addition  to  the  normal  parameters  it  has.  The  values  of  the  display  parameters  are  deter¬ 
mined  at  the  time  a  closure  of  Nj  is  created.  Immutable  lexicals  are  stored  directly  in  the  dis¬ 
play,  while  mutable  lexicals  are  stored  in  an  object  whose  pointer  is  passed  in  the  display. 
More  than  one  such  object  may  be  present  if  Nj  uses  mutable  lexicals  from  several  levels  of 
enclosing  lambdas. 

Once  the  display  parameters  are  assigned  to  the  sublambdas,  the  code  of  L  is  modified  to 
store  the  display  parameters  into  a  closure  whenever  one  is  created,  and  the  Nj’s  are  modified 
to  use  the  display  parameters  instead  of  referencing  L’s  locals  directly.  If  L  has  any  mutable 
lexicals,  it  creates  an  object  containing  them  upon  entry  and  treats  mutable  lexicals  as  if  they 
were  instance  variables  of  that  object;  any  mutable  lexicals  that  are  also  parameters  of  L  are 
copied  into  that  object  as  soon  as  it  is  created.  The  object  containing  mut^le  lexicals  is  itself 
mutable,  so  only  one  copy  of  it  per  invocation  of  L  can  be  present  on  the  J-Machine.  The  ob¬ 
ject  is  not  disposed  because  Optimist  II  cannot  determine  the  temporal  lifetime  of  a  closure; 
the  object  and  the  closures  have  to  be  garbage-collected. 

After  the  above  transformation,  L  has  exclusive  access  to  its  locals.  Since  some  of  the  Nj’s 
could  themselves  have  locals  used  by  their  sublambdas,  the  lambda-collapser  calls  itself  re¬ 
cursively  on  every  lambda  and  closure  ''ontained  in  L,  even  if  that  lambda  did  not  have  any 
external  free  variables. 

Efficiency  Considerations 

There  are  several  advantages  for  using  immutable  lexicals  instead  of  mutable  lexicals: 

•  Immutable  lexicals  are  stored  directly  in  a  closure’s  display,  so  the  closure  has  immediate 
access  to  their  values. 

•  Closures  are  immutable  objects.  If  many  closures  are  executing  simultaneously,  many 
copies  of  the  closures  and  their  immutable  lexicals  can  be  made.  On  the  other  hand,  if  many 
copies  of  a  closure  with  a  mutable  lexical  are  executing,  the  copies  will  be  contending  for  the 
single  object  containing  that  lexical’s  current  value. 

•  The  outer  lambda  can  store  immutable  lexicals  in  its  context  or  in  registers,  while  it  has 
to  allocate  an  object  for  mutable  lexicals  and  keep  their  values  there. 

In  order  to  ensure  that  lexically  scoped  variables  are  immutable  lexicals,  the  programmer 
should  check  that  their  values  are  not  altered  after  any  closures  which  might  reference  them 
are  created. 

Example 

Consider  the  following  code: 
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(defun  outer  (x) 

(clet  (  (y  3) 

(z  4) 

(t  D) 

(clet  ((innerl 

(lambda  () 

((lambda  ()  (cset  x  z) ) )  ;innerll 
(write  X  y) ) ) 

(inner2 

(lambda  () 

(write  y) ) ) 

(inner3 

(lambda  (a) 

(write  a)  ) )  ) 

(if  (zero?  x) 

(innerl) 

(cset  X  5) ) 

(cset  z  3) 

(inner  2) 

(write  X  y  z  t) ) ) ) 

The  lambda-collapser  first  determines  that  the  outer  lambda  has  no  free  variables,  so  it  is 
made  into  a  normal  function  instead  of  a  closure.  Next  it  examines  the  three  sublambdas 
within  outer:  innerl,  inner2,  and  innerS.  Innerl  will  become  a  closure  because  it  has  three  free 
variables,  x,  y  and  z.  It  writes  to  x,  so  x  becomes  a  mutable  lexical;  although  innerl  does  not 
write  to  y  and  z,  another  lambda  might,  so  y’s  and  z’s  statuses  are  unknown.  Inner2  will  also 
become  a  closure  because  it  has  one  free  variable,  y,  whose  status  is  still  unknown.  Since  irv 
ner3  has  no  free  variables,  it  becomes  a  normal  function. 

Next  the  lambda-collapser  runs  the  state  machines  on  the  x,  y,  and  z  locals  in  outer;  outer 
also  has  other  locals  such  as  t,  innerl ,  inner2,  and  innerS,  but  those  are  not  referenced  by  any 
inner  lambdas.  X  is  already  known  to  be  a  mutable  lexical  of  the  first  kind.  Y  is  not  written 
anywhere  after  innerl  and  inner2  are  created,  so  it  is  an  immutable  lexical.  Z  is  written  after 
the  innerl  closure  is  created,  and  the  compiler  makes  it  a  mutable  lexical  of  the  second  kind. 
Unfortunately,  the  compiler  does  not  realize  that  Z  is  altered  only  after  innerl  finishes  t  e- 
cuting;  if  it  were  smarter,  it  could  have  made  z  an  immutable  lexical.  Finally,  the  lambda- 
collapser  creates  the  displays  and  alters  the  code  of  the  lambdas  to  produce  a  parameter¬ 
passing  pattern  shown  in  Table  3-6. 


Table  3-6.  Lambda-Collapser  Example  Results 


Outer 

Innerl 

Innerl  1 

Inner2 

InnerS 

Parameters 

X  (copied  into 

lexical-object) 

a 

Returns 

continuation-0 

continuation-1 

continuation-2 

continuation's 

continuation-4 

Display 

lexical-object 

y 

Locals 

y 

t 

innerl 

inner2 

innerS 

lexical-object 

a 

lexical-object 

Instance  Variables 

X 

z 
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Top-Level  Evaluator 

Lambda-collapsing  was  the  last  preliminary  hcode  transformation.  At  this  point  the  hcode  is 
in  a  format  understood  by  the  interpreter.  If  it  found  no  syntax  errors,  Optimist  II  now  eval¬ 
uates  the  Concurrent  Smalltalk  expression  it  just  read  by  running  the  expression’s  hcodes 
through  the  hcode  interpreter.  If  the  expression  contained  any  directives,  the  interpreter  ex¬ 
ecutes  them  at  this  time. 

Interpreter 

The  interpreter  is  a  simple  hcode  interpreter  for  executing  Concurrent  Smalltalk  programs. 
The  interpreter  is  completely  sequential.  Except  for  full  futures  and  some  unimplemented 
input/output  facilities,  the  interpreter  is  a  valid  Concurrent  Smalltalk  implementation — the 
Concurrent  Smalltalk  definition  allows  cfutures  to  be  touched  at  the  implementation’s  dis¬ 
cretion,  so  a  completely  sequential  Concurrent  Smalltalk  interpreter  trivially  “touches”  each 
cfuture  as  soon  as  it  is  created.  While  the  interpreter  never  achieves  any  parallelism,  it 
couldn’t  use  parallelism  if  it  had  any  because  it  is  running  on  a  sequential  computer. 

The  interpreter  in  Optimist  II  was  provided  for  three  reasons: 

•  It  is  a  powerful  constant  expression  evaluator  for  expressions  encountered  while  compil¬ 
ing  Concurrent  Smalltalk  programs. 

•  It  is  the  most  interactive  Concurrent  Smalltalk  environment,  allowing  methods  and  func¬ 
tions  to  be  changed  almost  instantly. 

•  It  permits  debugging  of  Concurrent  Smalltalk  programs  before  they  are  compiled  into 
MDP  assembly  language. 

•  It  maintains  the  Concurrent  Smalltalk  global  environment  and  permits  interactive  exam¬ 
ination  of  that  environment. 

Currently  the  interpreter  can  only  interpret  unoptimized  hcode;  however,  a  bypass  hcode 
path  could  be  added  to  transfer  optimized  hcode  back  to  the  interpreter.  This  bypass  is  not 
quite  as  simple  as  it  sounds  because  the  format  of  continuations  changes  during  optimiza¬ 
tion. 
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3.3.  Optimization 

As  long  as  no  MDP  code  output  is  desired,  Optimist  II  does  not  leave  its  first  phase.  Only 
when  a  compile  command  is  issued  does  Optimist  II  enter  its  second  phase,  its  first  goal  be¬ 
ing  to  determine  just  what  it  should  compile.  Every  compile  command  requires  a  root  set  of 
objects  that  should  be  compiled.  The  compiler  uses  the  treewalker  to  automatically  deter¬ 
mine  the  minimum  amount  of  code  that  has  to  be  compiled  and  loaded  in  order  to  permit 
running  the  functions  in  the  root  set  on  the  J -Machine. 


Treewalker 


The  root  set  specified  in  the  compile  command  is  passed  to  the  treewalker,  which  appends  it 
to  its  own  permanent  root  set  of  objects  which  must  always  be  compiled  (Table  3-7).  The 
treewalker  then  calls  the  optimizer  on  each  code  object  in  its  set  and  scans  the  optimized 
hcode  (if  the  object  is  not  code,  the  treewalker  scans  it  directly).  If,  while  scanning,  it  en¬ 
counters  an  object  not  in  its  current  set  of  objects,  it  adds  that  object  to  its  set,  optimizes  it  if 
necessary,  and  scans  it.  The  process  continues  until  every  object  referenced  by  any  object  in 
the  treewalker’s  set  is  also  in  that  set.  At  that  point  the  second  phase  of  the  compiler  has 
completed  and  the  treewalker  calls  the  compiler’s  third  stage  to  compile  and  assemble  each 
object  in  the  set  and  print  the  resulting  MDPSim  code  into  a  text  file. 


Table  3-7.  Permanent  Root  Objects 


_closure 

boolean 

character 

t :class 

context 

# :continuation 

displacement 

distob j 

distributed -cl  ass 

« : false 

float 

funct 

function 

global 

integer 

magnitude 

null 

number 

object 

primitive-class 

real 

selector 

standard-class 

symbol 

# : true 

These  objects  are  emitted  in  the  output  assembly  file  regardless  of  which  objects  were  compiled.  _closure, 
context,  displacement,  #  :continuation,  and  global  are  internal  Optimist  II  classes. 


Calling  the  Optimizer 

The  optimizer  is  called  simply  by  requesting  the  value  of  the  hcode  or  mdp-hcode  CLOS  slot 
in  a  Concurrent  Smalltalk  lambda  (cst-lambda).  If  the  lambda  has  already  been  optimized, 
these  slots  contain  the  optimized  hcode  and  hcode  optimized  for  the  MDP,  respectively.  If 
not,  those  slots  are  unbound,  and  CLOS  calls  the  optimizer  to  calculate  their  values.  Thus,  a 
lambda’s  optimized  hcode  can  be  requested  repeatedly  by  the  treewalker  or  the  optimizer 
without  a  performance  penalty.  To  prevent  infinite  loops,  a  semaphore  keeps  a  function  op¬ 
timizing  a  lambda  from  requesting  that  lambda’s  optimized  hcode.  One  of  the  consequences 
of  this  rule  is  that  a  function  may  not  be  inlined  inside  itself. 


Guide  to  Optimizations 

The  transformations  done  by  the  optimizer  are  summarized  in  Figure  3-4.  The  transforma¬ 
tions  can  be  divided  roughly  into  two  classes:  general  hcode  optimizations  and  MDP-specific 
optimizations  and  transformations.  The  general  optimizations  occupying  the  first  half  of  the 
optimizer  produce  optimized  hcode.  If  MDP  assembly  code  output  is  desired,  the  second  half 
of  the  optimizer  is  invoked  to  convert  a  number  of  hcode  constructs  into  simpler,  MDP-spe¬ 
cific  ones.  For  example,  the  second  half  of  the  optimizer  converts  globals  into  references  to 
global  objects,  CAS  built-ins  into  code  that  explicitly  compares  and  sets  values,  and  three-ar¬ 
gument  sums  into  two  two-argument  sums.  The  order  of  optimization  is  critical;  expansion 
of  CASes  into  compare-and-set  code  could  not  have  been  done  in  the  first  half  of  the  optimizer 
because  there  was  no  way  to  assure  its  atomicity. 
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Type 

Special  izer 


Move 

Eliminator 


Touch 

Eliminator 


Dataflow 

Optimizer 


Constant 

Folder 


Tail 

Forwarder 


Structural 

Optimizer 


Fork 

Merger 


Join 

Merger 


Copier  and 
Parameter  Substituter 


Structural 

Optimizer 


Nconcurrently 

Flattener 


Continuation 

Expander 


Iterative 

Optimizer 


Function 

Inliner 


Iterative 

Optimizer 


Local 

Eliminator 


Optimized  Hcode 


Addressing  Mode 
Flattener 


Statement 

Splitter 


Structural 

Optimizer 


Built-in 

Optimizer 


Touch 

Eliminator 


IVar-Target 

Transformer 


Grab 

Introducer 


Cfuture  Parameter 
Eliminator 


Local 

Eliminator 


Enter/Exit 

Introducer 


MDP-Specific  Hcode 

Figure  3-4.  Optimizer  Organization 

The  first  few  filters  convert  the  hcode  produced  by  the  initial  phase  into  a  format  usable  by  the  optimizer.  The  it¬ 
erative  optimizer  and  function  inliner  perform  the  major  optimizations  The  remaining  filters  implement  some  Con¬ 
current  Smalltalk  features  out  of  more  basic  ones  and  fix  a  lew  quirks  in  the  Cosmos  and  MOP  architectures. 
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The  transformations  new  in  Optimist  II  will  be  described  in  the  order  in  which  they  are  per¬ 
formed.  The  following  optimizations  will  not  be  described  here  because  they  were  present  in 
Optimist  and  their  concepts  have  not  changed  significantly: 

•  Dead  Definition  Eliminator 

•  Move  eliminator 

•  Touch  eliminator 

•  Tail  forwarder 

•  Fork  merger 

•  Join  merger 

Also,  the  structural  optimizer  was  used  in  the  first  phase  of  the  compiler  and  is  described 
there. 


Preparatory  Transformations 

Lambda  Copier  and  Structural  Optimizer 

Before  optimizing  a  lambda,  the  optimizer  first  makes  a  copy  of  it  to  avoid  destroying  the 
copy  used  by  the  global  environment  and  the  interpreter.  While  copying  the  lambda,  the  op¬ 
timizer  assigns  consecutive  indices  to  the  lambda’s  local  variables.  These  indices  will  allow 
the  use  of  fast  bitmaps  to  represent  local  variable  data  during  later  dataflow  analysis  rou¬ 
tines. 

At  this  stage  the  optimizer  replaces  all  references  to  parameters'  with  their  values.  Once  the 
compilation  process  has  begun,  values  of  parameters  cannot  change,  and  replacing  pa¬ 
rameters  with  constants  as  early  as  possible  enables  early  constant  folding  and  dead  code 
elimination.  Parameters  are  usually  used  to  hold  global  functions  and  compiler  conditionals 
such  as  a  debugging  flag.  Debugging  code  can  be  compiled  conditionally  by  enclosing  it 
within  an  if  statement  conditioned  on  a  debug  parameter.  If  debug  is  false,  the  code  and  the 
if  statement  are  removed  by  the  structural  optimizer  immediately  following  the  copier;  the 
remaining  optimizations  don’t  even  see  that  code.  Dead  code  is  best  removed  early  because 
removing  it  enlarges  basic  blocks,  permits  additional  function  inlining,  and  improves  the  per¬ 
formance  of  the  dataflow  optimizer  and  tail  forwarder.  It  is  unfortunate  that  conditional  de¬ 
bugging  code  cannot  be  removed  before  lambda  collapsing,  but  doing  that  would  prevent 
changes  in  the  debug  parameter  from  having  any  effect  on  existing  code. 

The  structural  optimizer  cleans  the  code  to  give  the  nconcurrently  flattener  maximum  lati¬ 
tude  in  scheduling  nconcurrently  hcodes. 

Nconcurrently  Flattener 

The  nconcurrently  flattener  remo  'es  nconcurrently  hcodes  from  the  lambda  being  optimized. 
Later  optimizations  run  many  dataflow  calculations  on  the  lambda,  and  the  presence  of 
nconcurrentlys  would  complicate  dataflow  analysis  and  make  some  optimizations  less  effec¬ 
tive.  In  the  interest  of  compiler  simplicity  1  decided  to  remove  nconcurrentlys  at  this  stage. 

The  nconcurrently  flattener  uses  a  heuristic  to  interleave  the  nconcurrentlys  it  is  flattening. 
If  it  finds  a  nconcurrently  statement  with  more  than  one  thread,  it  first  calls  itself  recursively 
on  each  thread  and  then  separates  each  thread  into  a  leading  and  a  trailing  set  of  state¬ 
ments.  A  thread’s  trailing  set  of  statements  contains  the  longest  string  of  consecutive  hcodes 
at  the  end  of  the  thread  which  are  not  considered  worth  advancing  relative  to  other  hcodes  in 
the  lambda.  The  trailing  set  cannot  contain  any  forks  or  joins  of  flow-of-control  paths.  All 
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In  this  paragraph  parameters  means  parameter  globals  defined  in  Section  A.ii,  not  function  parameters. 
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other  statements  in  the  thread  are  placed  in  the  thread’s  leading  set.  Once  the  nconcurrently 
flattener  separates  each  thread  into  the  two  sets,  it  replaces  the  nconcurrently  hcode  with  all 
leading  sets  concatenated  together  followed  by  all  trailing  sets  concatenated  together. 

Hcodes  worth  advancing  are  non-built-in  function  and  method  calls  and  any  hcodes  which  re¬ 
turn  values  through  continuations;  all  other  hcodes  are  not  considered  worth  advancing. 
Hcodes  not  worth  advancing  are  pushed  as  far  back  as  possible  by  the  nconcurrently  flat¬ 
tener,  which  displaces  hcodes  worth  advancing  forward. 

The  nconcurrently  flattener  could  use  more  complicated  heuristics  to  increase  parallelism. 
For  example,  it  could  realize  that  no  matter  how  it  orders  function  calls  in  statements  such 
as  (f  (a  (b  1))  (c  (d  2)  )),  there  would  remain  a  possibility  of  a  loss  of  concurrency 
caused  by  touching  the  intermediate  results  (b  1)  and  (d  2)  in  the  wrong  order.  Hence,  it 
could  split  the  calculation  of,  say,  (a  (b  1) )  into  a  separate  function  call  to  avoid  a  poten¬ 
tial  loss  of  concurrency.  Nevertheless,  the  nconcurrently  flattener’s  current  heuristic  seems 
adequate. 

Continuation  Expander 

The  continuation  expander  is  the  one  MDP-specific  transformation  that  is  done  early.  So  far 
in  the  compiler,  continuations  have  been  represented  as  single  words,  while  on  an  MDP  a 
continuation  is  two  words — the  context  to  which  the  continuation  is  pointing  and  an  offset  of 
a  slot  within  the  context  where  the  return  value  should  be  stored.  I  originally  planned  to 
implement  continuations  as  a  special  case  of  inline  objects,  but  writing  a  general  implemen¬ 
tation  of  inlined  objects  would  have  been  too  time-consuming  and  inappropriate  for  an  initial 
version  of  the  compiler.  Hence,  I  included  a  partial  implementation  of  inline  objects  that 
only  inlines  continuations. 

The  continuation  expander  expands  each  local  variable  of  type  continuation  into  two  vari¬ 
ables,  one  of  type  context  and  the  other  of  type  displacement.  Similarly,  each  formal  and 
display  parameter  typed  continuation  is  made  to  correspond  to  two  local  variables.  A 
move  hcode  moving  a  continuation  is  changed  into  two  moves,  while  an  application  hcode  calls 
its  function  with  both  new  locals  as  arguments. 

Chungking  structures  of  instance  objects  and  global  variables  containing  continuations  is  hard 
at  this  stage  of  compilation,  so  to  avoid  this  problem  continuations  have  not  been  made  first- 
class  objects — there  is  no  way  to  store  continuation  in  an  instance  variable  of  an  object; 
disallowing  programmer-visible  continuation  local  variables  ensures  that  no  continuation  be¬ 
comes  a  mutable  lexical  which  would  get  stored  in  an  instance  object. 

Iterative  Optimizations 

The  iterative  optimizations  perform  general  dataflow  and  constant  propagation  optimiza¬ 
tions.  They  are  called  in  a  loop  until  none  of  them  changes  the  lambda.  All  of  the  optimiza¬ 
tions  were  altered  in  some  way  since  Optimist;  most  had  to  be  updated  to  handle  multiple  re¬ 
turn  values  and  typed  variables,  and  some  were  changed  because  reply  is  no  longer  an  ex¬ 
plicit  hcode.  However,  only  the  new  features  will  be  described  below. 

Type  Specialize! 

Local  variables  in  Optimist  II  are  associated  with  types  in  two  ways: 

1.  The  variable  itself  has  a  type  supplied  by  the  programmer  when  the  variable  is  declared. 
This  type  applies  throughout  the  variable’s  lifetime. 

2.  The  programmer  can  declare  types  through  the  use  of  the  type  assertion  primitive 
(Section  A.6),  or  the  compiler  can  infer  from  its  knowledge  about  the  types  of  function  and 
method  arguments  and  results  that  a  variable  has  a  particular  type  at  a  given  point  in  the 
lambda.  These  type  assertions  apply  only  to  a  particular  point  in  the  variable’s  lifetime. 


41 


Concurrent  Smalltalk  on  the  Message-Driven  Processor 


Each  type  asserted  in  this  manner  must  have  a  non-null  intersection  with  the  variable’s  type; 
otherwise,  no  legal  value  could  be  stored  in  the  variable  and  Optimist  11  generates  an  error. 

The  type  specializer  examines  each  variable  and  calculates  the  tub  of  the  types  it  can  assume 
throughout  its  lifetime,  combining  all  the  knowledge  it  has  from  assertions  of  the  second 
kind.  It  then  intersects  the  variab^le’s  type  with  the  lub  and  makes  that  the  variable’s  new, 
more  restricted  type. 

Type  specialization  is  done  to  improve  the  quality  of  the  move  elimination  optimization  and 
to  permit  inlining  of  values  in  the  future.  When  the  move  eliminator  merges  two  variables,  it 
sets  the  new  variable’s  type  to  the  lub  of  the  variables’  types.  The  temporaries  created  by 
other  optimizations  often  have  type  object  even  though  they  can  contain  only  more  re¬ 
stricted  values,  and  if  one  of  them  were  merged  into  an  existing  variable,  that  variable’s  type 
would  also  become  object  unless  the  temporary’s  type  were  specialized  first  by  the  type  spe¬ 
cializer.  When  Optimist  II  supports  inline  classes  in  the  future,  type  specialization  of  a  vari¬ 
able  to  an  inlineable  class  will  permit  some  objects  such  as  double-precision  floating  point 
numbers  and  locks  to  be  inlined  in  local  variables. 

Dataflow  Optimizer 

The  dataflow  optimizer  has  an  extra  optimization  in  addition  to  those  mentioned  in  [21].  The 
dataflow  optimizer  always  checked  whether  an  if  statement  would  always  branch  one  way 
and  eliminated  the  if  statement  and  the  dead  branch  if  that  is  the  case.  In  addition  to  that 
check,  if  the  if  statement  has  several  predecessors,  the  dataflow  optimizer  now  checks  each 
one  separately  whether  it  would  cause  the  if  statement  to  always  branch  one  way;  if  so,  that 
predecessor  is  connected  directly  to  one  of  the  if  statement’s  branches.  This  situation  arises 
often  when  sc-and  and  sc-or  are  used.  A  code  fragment  like  the  one  below  is  generated  for 
(if  (sc-and  a  b)  (f)): 

(IF  :FALSE  (LOCAL  CST : : A)  2246) 

(MOVE  (LOCAL  387)  (LOCAL  CST::B)) 

(JUMP  2248) 

(LABEL  2246) 

(MOVE  (LOCAL  387)  #<False» 

(LABEL  2248) 

(IF  :FALSE  (LOCAL  387)  2252) 

(APPLY  NIL  (#<Lambda  CST::F>)) 

(LABEL  2252) 

It  is  optimized  to: 

(IF  :FALSE  (LOCAL  CST :: A)  2252) 

(IF  :FALSE  (LOCAL  CST :: B)  2252) 

(APPLY  NIL  (#<Lambda  CST: :F>) ) 

(LABEL  2252) 

Constant  Folder 

'The  constant  folder  performs  two  duties:  it  evaluates  constant  expressions  and  replaces 
method  calls  with  function  calls.  The  constant  folder  examines  each  application  statement  in 
the  lambda.  If  the  arguments  are  all  values,  the  function  or  method  to  be  invoked  is  side-ef- 
fecl-free,  and  the  precise  mode  is  off,  the  constant  folder  calls  the  interpreter  to  evaluate  the 
function  or  method  call  and  replace  it  with  move  statements  of  the  results  to  the  application’s 
targets.  If  the  interpreter  generates  an  error,  the  compiler  aborts  the  compilation;  the  error 
is  not  hidden  until  runtime.  The  call  could  potentially  invoke  many  functions  and  methods. 

One  has  to  be  a  little  careful  with  this  optimization — if  all  inputs  to  a  program  are  specified. 
Optimist  II  is  perfectly  willing  to  precalculate  the  program’s  results  and  compile  the  entire 
program  to  a  single  function  that  returns  the  answer.  This  will  happen  often  on  benchmarks, 


*See  Appendix  B. 
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especially  when  Optimist  II  learns  how  to  automatically  determine  which  functions  are  Side- 
effect-free;  currently  it  assumes  that  a  function  is  not  side-effect-free  unless  explicitly  de¬ 
clared  so  by  the  programmer. 

In  addition  to  evaluating  applications  with  all  arguments  specified,  the  constant  folder  also 
simplifies  built-in  operations  such  as  arithmetic  and  logical  primitives  according  to  the  iden¬ 
tity  rules  listed  in  Table  A-4. 

When  the  constant  folder  encounters  a  method  call,  it  looks  in  the  selector’s  table  of  methods 
and  selects  all  methods  which  match  the  number  and  types  of  arguments  provided.  The  ar¬ 
guments’  types  are  determined  by  dataflow  analysis  of  the  same  information  as  is  used  by 
the  type  specializer.  If  no  methods  match,  the  constant  folder  signals  an  error.  If  exactly  one 
method  matches,  the  constant  folder  replaces  the  method  dispatch  with  a  direct  call  of  the 
method’s  code,  which  may  even  be  inlined  later.  If  two  methods  match,  the  constant  folder 
uses  a  heuristic  to  determine  whether  it  is  better  to  do  a  standard  method  dispatch  or  to  get 
the  type  of  the  first  argument  and  call  one  of  the  two  methods  depending  on  that  type. 

'The  heuristic  is  as  follows:  The  classes  of  the  first  arguments  accepted  by  the  two  methods 
are  determined.  If  the  two  classes  are  disjoint,  the  constant  folder  picks  the  class  that  is 
easier  to  checks  If  one  is  a  subclass  of  the  other,  the  constant  folder  picks  the  subclass; 
otherwise,  the  constant  folder  gives  up  and  does  not  optimize.  If  the  picked  class  is  easier  to 
check  than  doing  a  method  dispatch,  the  constant  folder  replaces  the  application  with  a  call 
to  the  class’s  predicate  followed  by  an  if  statement  with  direct  calls  to  the  two  methods  on  the 
two  sides  of  the  conditional. 

Function  Inlining 

Functions  are  inlined  after  all  iterative  optimizations  have  been  performed  and  can  yield  no 
more  improvements.  To  inline  functions,  the  function  inliner  considers  each  function  call*  in 
the  lambda.  If  the  function  is  not  a  built-in  and  is  declared  inlineable,  the  function  inliner 
attempts  to  inline  it;  however,  there  is  no  a  priori  guarantee  that  it  will  succeed.  A  function 
is  considered  inlineable  if  it  is  either  declared  inline  by  the  user  or  heuristically  inlineable 
and  not  declared  not-inline  by  the  user.  To  be  heuristically  inlineable,  a  function  has  to  be 
small — its  optimized  hcode  can  contain  no  more  than  two  full-fledged  function  or  method 
calls  and  no  more  than  twelve  built-in  calls.  A  point  system  is  used  to  determine  a  function’s 
“size;”  the  threshold  can  be  varied  by  adjusting  the  inline-size-cutoff  option. 

Furthermore,  to  prevent  object  thrashing  on  the  J-Machine,  a  function  is  'neuristically  unin- 
lineable  if  it  references  an  instance  variable  of  an  object  passed  as  its  first  argument  if  the 
caller  of  that  function  does  not  pass  its  first  argument  through  as  the  first  argument  of  the 
function.  To  see  an  example  of  this  rule,  consider  the  function  sum4  in: 

(defclass  pair  ()  car  cdr) 

(defun  sum4  (p;pair  q:pair)  (+  (car  p)  (car  q)  (cdr  p)  (cdr  q) ) 

The  car  and  cdr  accessor  methods  are  well  under  the  size  threshold.  However,  only  the 
(car  p)  and  (cdr  p)  calls  are  inlined — (car  q)  and  (cdr  q)  are  not  because  the  calling 
function  sum4  does  not  pass  its  first  argument  p  as  car’s  or  cdr’s  first  argument.  There  is 
no  problem  with  inlining  (car  p)  and  (cdr  p)  into  direct  accesses  of  p’s  instance  variables 
because  sum4  is  executed  on  the  same  node  on  which  p  resides.  However,  if  sum4  were  to 
reference  q’s  instance  variables  directly,  it  would  force  q  to  travel  to  the  same  node  on  which 
p  resides,  thrashing  q.  Instead,  sum4  calls  (car  q)  and  (cdr  q)  in  the  usual  manner,  and 


^Each  class  has  an  integer  that  specifies  how  easy  it  is  to  tost  an  orbitrary  object  for  membership  in  that  class.  If 
that  integer  is  zero,  doing  this  check  is  no  easier  than  doing  o  method  dispatch;  if  that  integer  is  a  high  positive 
value  such  as  six  or  seven,  this  test  can  be  done  in  one  or  two  assembly  language  instructions.  Built-in  classes  such 
as  boolean  or  nul  1  allow  easy  membership  chocking,  while  user-defined  classes  do  not. 

^Method  calls  cannot  be  inlined  unless  they  were  previously  converted  into  function  calls  by  the  Constant  Folder. 
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the  car  and  edr  methods  are  executed  on  q’s  node  and  return  their  results  to  3um4  running 
on  p’s  node. 

The  function  inliner  tries  to  avoid  forcing  objects  to  migrate  whenever  possible.  This  is  not 
necessarily  the  optimal  strategy — in  some  cases  it  might  be  better  to  migrate  an  object  to  a 
method  that  accesses  it  frequently — but  the  desirability  of  migrating  the  object  is  difficult  to 
determine  by  the  compiler  because  it  depends  on  the  frequency  of  the  object’s  use  by  other 
processes  in  the  system.  Thus,  the  simple  solution  of  minimizing  object  migration  was  taken; 
in  the  cases  outlined  above,  a  method  that  makes  numerous  distant  object  accesses  can  usu¬ 
ally  be  rewritten  as  several  communicating  methods  which  only  access  local  objects. 

Once  the  function  inliner  decides  whether  it  would  like  to  inline  a  function,  it  attempts  to  in¬ 
line  the  function’s  optimized  hcode.  Nevertheless,  it  might  still  encounter  difficulties  if  the 
inlined  function  performs  nontrivial  processing  after  it  returns  its  result.  For  example,  con¬ 
sider  the  functions  silly-add,  shelll,  and  shell2: 

(defun  silly-add  (x  y) 

(reply  (+  x  y) ) 

(prove-fermata-last-theorem) 

(exit) ) 

(defun  shelll  (x  y) 

(cset  ( (z  (silly-add  x  y) ) ) 

(+  z  5)  )) 

(defun  shell2  (x  y) 

(silly-add  (+  x  5)  y) ) 

If  the  function  inliner  were  to  inline  silly-add  in  shelll,  it  might  convert  a  terminating 
program  into  a  nonterminating  one  (assuming  prove-fermats-last-theorem  does  not 
terminate  in  any  reasonable  amount  of  time  in  this  example)  because  shelll  would  try  to 
execute  all  of  silly-add  before  continuing  with  the  addition  of  5  to  z.  Thus,  the  function 
inliner  should  not  inline  any  function  that  performs  nontrivial  processing  after  it  replies  to 
its  caller.  On  the  other  hand,  there  is  nothing  wrong  with  inlining  silly-add  in  shell2  as 
long  as  shell2  is  tail-forwarded  because  shell2  would  still  return  the  sum  to  its  caller  be¬ 
fore  trying  to  prove  Fermat’s  last  theorem.  Other  interesting  scenarios  with  callers  and 
callees  accessing  the  same  lock  are  also  possible. 

The  general  rule  for  determining  whether  it  is  safe  to  inline  a  function  is  as  follows;  inlining 
is  safe  unless  the  inlined  function  performs  nontrivial  processing  after  replying  to  the  caller 
all  return  values  that  the  caller  is  not  tail-forwarding.  It  does  not  matter  if  or  when  the  in- 
lined  function  replies  to  any  other  functions  in  whose  lexical  environment  it  might  be;  i.e. 
non-local  lexical  returns  by  the  inlined  function  are  fine  as  long  as  they  don’t  transfer  control 
to  the  caller^. 

After  copying  the  inlined  function,  the  function  inliner  implements  the  above  rule.  It  runs  a 
dataflow  analysis  on  the  continuation  local  variables  in  the  inlined  function  to  determine 
where  each  continuation  reference  can  return  its  value;  if  it  has  any  problems  with  perform¬ 
ing  this  analysis,  it  does  not  inline  the  function.  Next,  the  function  inliner  uses  the  dataflow 
problem  solver  again  to  verify  that  no  statement  that  returns  a  value  to  the  caller  is  followed 
by  any  statement  that  might  not  terminate. 

Once  all  of  these  conditions  are  satisfied,  the  function  inliner  splices  the  inlined  function’s 
code  and  local  variables  into  the  caller.  Then  it  introduces  move  statements  to  move  the 
caller’s  arguments  to  the  appropriate  locals  in  the  callee.  If  the  callee  wasn’t  non-strict,  each 
argument  is  touched  as  it  is  moved.  Also,  the  statements  returning  values  from  the  callee  to 
the  caller  are  modified  to  store  the  values  in  more  temporaries,  which  are  moved  to  their 
proper  destinations  after  the  spliced  callee’s  code.  Needless  to  say,  the  move  eliminator  will 


^However,  for  simplicity  Optimist  11  docs  not  inline  clo.surc  calls. 
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have  a  lot  of  work  cleaning  up  the  extra  moves  just  introduced,  but  they  are  necessary  to 
make  sure  that  functions  are  inlined  correctly  in  all  cases. 

To  make  sure  that  the  compiler  terminates,  it  does  only  one  pass  of  function  inlining  for  each 
lambda;  otherwise,  it  could  peel  invocations  of  recursive  functions  forever.  However,  the  sin¬ 
gle  pass  of  inlining  does  not  mean  that  functions  are  only  inlined  one  level  deep;  on  the  con¬ 
trary,  the  callees  are  themselves  fully  optimized  before  ^ing  considered  for  inlining,  and  in 
the  process  of  being  optimized  they  may  let  other  functions  be  inlined  into  them.  It  is  true, 
though,  that  the  treewalker’s  antirecursion  rules  prevent  a  function  from  being  inlined  into 
itself. 

Once  all  potential  functions  are  inlined.  Optimist  II  performs  another  pass  of  iterative  opti¬ 
mizations  to  clean  up  and  optimize  the  code  introduced  by  the  inlining  process. 

Cleanup  Transformations 

Just  one  final  cleanup  transformation  is  done  on  hcode.  The  preceding  optimizations  gener¬ 
ated  a  number  of  local  variables  in  the  lambda,  many  of  which  are  no  longer  used.  The  local 
eliminator  removes  all  unused  locals  and  renumbers  the  remaining  locals  to  fill  the  gaps. 
This  simple  transformation  has  no  effect  on  the  code  generated  by  the  compiler  because  Op¬ 
timist  II’s  third  phase  will  compact  the  locals  anyway.  The  local  eliminator  is  present  solely 
for  aesthetic  and  compilation  speed  reasons — ^hcode  is  less  readable  if  it  has  many  unused  lo¬ 
cal  variables.  Also,  since  variable  bitmaps  are  represented  as  integers,  the  dataflow  code 
runs  much  faster  if  no  more  than  about  thirty  variables  are  present  so  Lisp  can  use  fixnums 
instead  of  bignums. 

MDP-Specific  Transformations 

A  number  of  MDP-specific  transformations  have  to  be  done  on  hcode  before  MDP  assembly 
code  can  be  generated.  These  transformations  and  optimizations  are  sketched  below  and  are 
listed  in  the  Postoptimizer  file. 

Global  Expander 

The  global  expander  implements  global  variables  as  instances  of  the  global  class.  Each  ref¬ 
erence  to  a  global  variable  is  replaced  with  a  reference  to  a  global  instance  object’s 
global-value  slot'.  The  global  instance  object  itself  is  a  mutable  immediate  object;  its  ID 
and  initial  value  are  known  to  the  compiler,  so  the  instance  object  can  be  referenced  by  any 
lambda  without  having  to  access  another  global. 

Addressing  Mode  Flattener 

The  addressing  mode  flattener  flattens  nested  hcode  lvalue  and  rvalue  expressions*  because 
the  assembly  language  compiler  can  only  compile  one-level  expressions.  Whenever  the  ad¬ 
dressing  mode  flattener  finds  a  nested  lvalue  or  rvalue  expression,  it  unnests  it  and  precedes 
the  hcode  containing  it  with  other  hcodes  that  calculate  the  expression’s  components  and 
store  them  in  local  variables. 

Statement  Splitter 

The  statement  splitter  is  the  first  of  two  MDP  built-in  optimization  filters.  This  filer  con¬ 
verts  associative  built-ins  such  as  +  and  and  with  more  than  two  arguments  into  chains  of 
two-argument  built-in  calls,  removes  type-assertion  statements  which  are  no  longer  needed. 


'The  global  class  name  and  accessors  to  its  global -vai  ue  slot  arc  all  unde f’d  just  alter  they  are  created,  so  they 
cannot  be  referenced  by  user  Concurrent  Smalltalk  programs,  and  no  name  conflicts  can  result. 

*Sce  Tables  5-3  and  5-4. 
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and  expands  many  primitives  and  hcodes  such  as  cas,  make-closure  and  force  into  their 
components. 

Built-in  Optimizer 

The  second  MDP  built-in  optimization  filter  is  the  built-in  optimizer.  This  optimizer  reduces 
the  strength  of  some  built-in  operations  such  as  multiplication  and  division  by  converting 
them  into  logical  shifts  using  the  identities  in  Table  A~4. 

The  built-in  optimizer  is  followed  by  another  call  to  the  touch  eliminator,  which  is  able  to 
eliminate  more  touches  than  it  could  previously.  At  this  point  the  touch  eliminator  can  de¬ 
pend  on  built-ins  not  being  optimized  out,  so  it  can  remove  touches  of  values  which  are  sub¬ 
sequently  used  by  built-ins.  For  example,  if  a  touch  of  a  is  immediately  followed  by  an  appli¬ 
cation  of  -)■  to  a  and  b,  the  touch  can  be  eliminated;  it  could  not  have  been  eliminated  before 
because  the  +  might  have  been  eliminated  or  another  statement  inserted  between  the  touch 
and  +. 

Instance  Variable  Target  Transformer 

This  transformation  and  the  following  two  correct  quirks  in  the  MDP  and  Cosmos  architec¬ 
tures.  One  restriction  of  the  Cosmos  design  is  that  the  targets  of  full-fledged  applications  can 
only  be  local  variables  in  the  context;  applications  other  than  built-ins  cannot  store  their  re¬ 
sults  directly  into  instance  variables  or  into  locals  in  places  other  than  the  context.  The  in¬ 
stance  variable  target  transformer  scans  for  application  statements  that  store  their  results  in 
instance  variables  and  modifies  them  to  store  the  results  in  local  variables  and  then  move 
them  into  the  instance  variables. 

Grab  Introducer 

The  grab  introducer  generalizes  the  instance  object  access  mechanism  in  Optimist.  While 
Optimist  could  access  at  most  one  instance  object  in  a  lambda,  Optimist  II  can  access  many. 
Unfortunately,  there  is  only  one  MDP  address  register,  ID2,  assigned  to  holding  pointers  to 
instance  objects.  Hence,  before  every  statement  S  that  might  access  an  instance  object,  the 
grab  introducer  checks  the  value  of  ID2  left  from  the  previous  statement;  if  that  value  is  in¬ 
correct,  the  grab  introducer  inserts  a  grab  statement  just  before  S  to  put  the  right  object  into 
ID2.  If  s  accesses  many  instance  objects,  the  grab  introducer  inserts  moves  and  uses  other 
statement-specific  techniques  to  make  s  access  only  one  instance  object;  doing  this  well  can 
become  quite  an  involved  process  for  some  hcodes. 

The  grab  introducer  also  generalizes  the  instance  object  part  of  the  Context  Optimization 
transformation  found  in  Optimist — if  an  instance  object  is  not  referenced,  there  is  no  need  to 
point  ID2  to  it  and  possibly  force  it  to  migrate. 

Cfuture  Parameter  Eliminator 

The  cfuture  parameter  eliminator  complements  the  instance  variable  target  transformer  by 
eliminating  application  statements  that  store  their  results  back  in  a  lambda’s  parameters. 
Unlike  Optimist,  Optimist  II  allows  function  and  methods  to  use  their  parameters  just  like 
any  other  local  variables,  and,  in  particular,  write  into  them.  However,  the  operating  system 
does  not  support  cfutures  in  a  function’s  parameter  area.  Hence,  if  the  cfuture  parameter 
eliminator  finds  a  parameter  p  used  as  a  target  of  a  full-fledged  application,  it  creates  a  new 
local  variable  I,  emits  a  move  to  copy  p  into  I  upon  entry  to  the  function,  and  substitutes  I  for 
every  use  of  p  in  the  function. 

Enter/Exit  Introducer 

The  last  two  filters  are  another  call  to  the  local  eliminator  and  the  introduction  of  enter  and 
exit  hcodes  at  the  beginning  and  end  of  the  lambda,  respectively.  The  compiler  will  compile 
these  hcodes  to  the  entry  and  cleanup  code  for  the  lambda. 
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3.4.  Code  Generation 

The  third  phase  of  Optimist  II  contains  the  hcode  compiler,  assembly  optimizer,  and  assem¬ 
bler.  The  hcode  compiler  compiles  hcode  into  an  assembly  language  module,  which  is  a  di¬ 
graph  of  assembly  language  statements.  The  assembler  and  the  assembly  optimizer  then  in¬ 
sert  branches  into  the  module  and  perform  peep-hole  optimizations  on  it.  Since  the  hcode 
compiler,  assembly  optimizer,  and  assembler  were  all  present  in  Optimist,  only  the  differ¬ 
ences  will  be  described  here. 

New  Hcode  Compiler  Features 

The  hcode  compiler  has  been  updated  for  CLOS,  the  new  Concurrent  Smalltalk,  and  the  new 
Architecture  version  IIB.  Major  Concurrent  Smalltalk  changes  affecting  the  compiler  in¬ 
clude  introduction  of  multiple  values  to  application  statements  and  the  introduction  of  many 
built-ins  which  compile  into  MDP  system  calls  or  sequences  of  MDP  instructions.  Built-ins 
for  even  such  low-level  facilities  such  as  reading  or  checking  tags  were  provided,  and  are  ac¬ 
cessed  by  the  Concurrent  Smalltalk  runtime  system. 

The  context  and  variable  allocation  schemes  have  changed  somewhat.  Optimist’s  graph-col- 
orer  for  allocating  context  local  variables  worked  well  and  has  been  extended  to  also  allow 
slots  in  the  message  to  be  reused  as  local  variables;  thus  the  slots  in  the  incoming  message 
and  the  slots  in  the  context  form  a  pool  of  slots  to  which  the  compiler  can  allocate  local  vari¬ 
ables  at  will.  The  only  restriction  imposed  by  Cosmos  is  that  local  variables  which  might 
contain  cfutures  cannot  be  assigned  to  incoming  message  slots. 

Unlike  JOSS,  Cosmos  fixes  the  locations  of  the  saved  registers  in  the  context.  If  a  function 
would  need  more  slots  than  the  fourteen  provided  in  a  standard  context,  Optimist  II  assigns 
the  extra  locals  to  slots  after  the  saved  register  area  in  the  context*,  up  to  a  limit  of  53  total 
slots;  the  MDP  cannot  readily  address  more  than  64  words  in  an  object,  and  11  are  used  for 
overhead.  If  a  large  context  is  needed.  Optimist  II  emits  code  to  create  it  when  the  function 
starts  execution  and  dispose  it  when  it  is  done. 

One  architectural  change  had  considerable  impact  on  all  stages  in  the  third  phase.  Architec¬ 
ture  IIB  allows  long  immediate  constants  and  long  displacement  into  objects  on  most  two- 
operand  instructions  but  not  three-operand  ones.  Optimist  II  takes  advantage  of  these  oper¬ 
ations  whenever  possible,  but  handling  the  worst  case  possibilities  is  now  more  complicated. 
For  example,  it  is  no  longer  true  that  an  add  instruction  allows  the  same  addressing  modes 
as  a  NEG. 

New  Assembler  Features 

The  assembler  has  been  upgraded  to  output  many  kinds  of  objects  instead  of  just  code.  When 
it  encounters  the  use  of  a  pointer  to  an  object  inside  another  object,  it  outputs  an  MDPSim 
reference  to  the  pointed  object.  MDPSim  resolves  all  of  these  references  when  it  downloads 
the  objects  to  its  simulated  J-Machine. 

Global  Compilation 

Unlike  Optimist,  which  compiled  isolated  modules.  Optimist  II  compiles  entire  programs. 
Hence,  it  has  the  additional  duty  of  emitting  the  “glue”  that  holds  programs  together.  In  par¬ 
ticular,  it  emits  class  definitions,  method  tables,  data  objects,  and  code  objects.  It  emits  all 
class  definitions  first  because  they  are  needed  to  load  other  objects.  The  order  of  the  other 


*See  Figure  4-9. 
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objects  does  not  matter  because  MDPSim  can  resolve  references  in  any  order,  .^ter  emitting 
objects,  Optimist  II  emits  code  that  automatically  downloads  the  objects  into  the  J-Machine. 

Identifiers 

Since  MDPSim  currently  allows  only  alphanumeric  characters  and  underscores  in  its  identi¬ 
fiers,  Optimist  II  converts  any  identifier  characters  outside  that  set  into  strings  of  characters 
in  that  set.  Next,  Optimist  II  prepends  the  kind  of  identifier  to  each  identifier  it  emits.  The 
kinds  are  listed  in  Table  3-8.  l^nally.  Optimist  II  checks  whether  another  identifier  with  the 
same  name  has  been  emitted.  If  so,  and  if  the  other  identifier  is  not  eq  to  the  current  one. 
Optimist  II  disambiguates  the  current  identifier  by  appending  two  underscores  and  a  num¬ 
ber  to  it.  This  transformation  is  necessary  because  sometimes  many  anonymous  functions 
are  generated. 

Table  3-8.  Identifier  Prefixes 


Kind 

Prefix 

Class 

c 

Selector 

sel 

Symbol 

sym 

Function 

f 

Other  Object 

0 

IDs 

To  allow  downloading  of  circular  data  structures.  Optimist  II  assigns  IDs  to  all  objects  it 
emits.  In  order  to  do  this  assignment,  it  has  to  know  how  many  nodes  there  are  in  the  J-Ma¬ 
chine  for  which  it  is  compiling.  This  number  is  provided  in  the  n-nodes  Optimist  II  option. 

Optimist  II  uses  increasing  positive  integers  to  generate  serial  numbers  for  classes,  selectors, 
and  symbols.  Functior  s  and  other  objects  are  assigned  IDs  starting  with  the  serial  number 
$7FFF  and  decreasing  to  avoid  conflicts  with  serial  numbers  generated  by  Cosmos,  which 
start  at  $0000  and  increase.  Optimist  II  tries  to  distribute  the  objects  it  creates  evenly 
throughout  the  MDPs  in  the  J-Machine. 

Method  Tables 

The  Optimist  IPs  assembler  generates  a  method  table  which  associate  methods  with  classes 
and  selectors.  The  method  table  is  distributed  among  the  class  and  selector  objects  loaded 
into  MDPSim.  Each  selector  (Figure  4-17)  contains  a  list  of  class/method  pairs  that  describe 
all  methods  defined  for  that  selector.  In  addition,  each  class  object  (Figure  4-13)  contains  an 
ordered  list  of  the  class’s  ancestors. 

Together,  the  two  objects  contain  enough  information  to  deduce  the  method  associated  with  a 
class  and  selector;  first  the  class  is  looked  up  in  the  selector’s  list  of  class/method  pairs;  if  it 
is  not  found,  the  ancestor  classes  are  looked  up,  one  by  one,  in  that  list.  Either  a  binding  is 
found,  in  which  case  the  binding  contains  the  desired  method,  or  no  binding  is  found,  in 
which  case  there  is  no  method  associated  with  the  given  class  and  selector. 

Data  Formats 

The  formats  of  built-in  objects  emitted  by  Optimist  II  are  described  in  more  detail  in  the  next 
chapter.  Primitive  objects  are  listed  in  Figure  4-2,  instance  objects  of  user-defined  classes 
are  shown  in  Figure  4-4,  and  functions  are  shown  in  Figure  4-20  and  closures  in  Figure  4-22. 
Optimist  II  cannot  emit  immediate  distributed  objects,  but  they  can  be  created  at  runtime. 
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3.5.  Conclusion 

The  main  goal  of  writing  Optimist  II  was  to  bridge  the  gap  between  Concurrent  Smalltalk 
and  the  J-Machine.  Optimist  II  is  the  first  compiler  that  can  compile  a  Concurrent  Smalltalk 
program  into  code  that  can  be  run  on  a  J-Machine  without  any  changes.  Unlike  Optimist, 
which  compiled  only  modules,  Optimist  II  compiles  entire  programs,  including  the  class  hier¬ 
archy,  method  tables,  functions,  and  immediate  objects.  Furthermore,  Optimist  II  supports  a 
much  larger  subset  of  Concurrent  Smalltalk  than  Optimist.  Optimist  II  supports  the  entire 
language  except  for  full  futures  and  I/O  facilities. 

Observations 

The  global  optimizations  included  in  Optimist  II  are  very  useful,  as  they  free  programmers 
from  having  to  break  abstraction  barriers  in  order  to  achieve  reasonable  performance.  This 
consideration  alone  was  a  great  help  in  writing  the  runtime  system.  Many  of  the  built-in 
functions  such  as  zero?  and  instance  variable  accessors  are  candidates  for  inlining,  and,  in 
fact,  they  are  often  inlined  into  user  programs.  Without  global  optimizations  writing  an  effi¬ 
cient  runtime  system  would  have  been  difficult  and  error-prone.  Zero?  could  perhaps  have 
been  implemented  as  a  macro,  but  then  it  would  not  be  possible  for  a  user  program  to  over¬ 
ride  it  for  its  own  classes.  Moreover,  zero?  would  then  suffer  from  the  classic  Lisp  problem 
of  a  macro  not  being  a  first-class  data  object  and  interchangeable  with  functions.  Also,  inlin¬ 
ing  of  functions  may  be  controlled  by  fairly  sophisticated  heuristics,  while  macros  would  al¬ 
ways  be  expanded. 

The  substitution  of  function  calls  for  method  calls  is  also  a  useful  optimization.  In  simple 
programs  almost  all  method  calls  are  replaced  with  function  calls  and  then  often  inlined.  In 
fact,  in  all  the  simple  and  non-contrived  examples  I  have  compiled,  Optimist  II  was  able  to 
remove  all  method  dispatches  and  replace  them  with  function  calls.  Even  in  applications  us¬ 
ing  Lisp-style  lists,  there  are  usually  at  most  two  methods  defined  on  an  object — one  method 
handles  the  nil  case,  while  the  other  handles  the  nontrivial  case — and  Optimist  II  turns  the 
method  call  into  a  comparison  of  the  argument  against  nil  followed  by  one  of  two  function 
calls,  often  inlined. 

Generality  or  Simplicity? 

One  recurring  issue  was  whether  Optimist  II  should  be  a  compiler  for  a  general  target  or  a 
compiler  specifically  tailored  to  the  J-Machine.  Ideally,  Optimist  II  should  have  a  back  end 
that  could  be  replaced  to  compile  code  for  a  different  architecture.  Unfortunately,  this  ideal 
was  not  achieved.  Although  many  MDP-specific  transformations  are  collected  near  the  end 
of  the  Optimizer,  some,  such  as  the  continuation  expander  had  to  be  placed  earlier  in  the 
compilation  process.  Worse,  much  of  the  runtime  system  at  the  very  front  of  the  compiler  is 
heavily  dependent  on  the  MDP  architecture. 

The  two  issues  at  odds  here  are  generality  and  simplicity.  Due  to  the  limited  scope  and  ex¬ 
perimental  nature  of  this  project,  I  resolved  conflicts  in  favor  of  simplicity.  For  example. 
Concurrent  Smalltalk  is  a  useful  systems  programming  language,  and  it  was  desirable  to 
implement  some  features  of  Concurrent  Smalltalk  in  Concurrent  Smalltalk.  While  this  ap¬ 
proach  would  make  the  Optimist  II  front  end  nonportable,  I  decided  to  use  this  approach 
anyway  because  it  made  the  runtime  system  simple  to  write,  understand,  and  modify. 

Future  Plans 

Optimist  II  is  still  an  evolving  compiler,  and  it  will  surely  change  in  the  future.  In  addition 
to  implementing  the  remaining  language  features  and  fixing  bugs.  Optimist  II  could  be  ex¬ 
tended  to  implement  inline  objects  and  the  load  balancing  ideas  discussed  in  Chapter  8.  In 
addition,  a  number  of  minor  tweaks  mentioned  in  [21]  are  still  possible.  Now  that  branches 
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have  a  longer  range,  Optimist  II  could  be  more  liberal  with  the  use  of  MDP  register  RO  to 
hold  values  between  statements’.  A  smarter  register  allocator  could  assign  a  variable  to  a 
register  for  part  of  its  lifetime.  The  peephole  optimizer  could  replace  branches  to  SUSPEND 
instructions  with  SUSPEND  instructions  themselves.  The  implementation  of  closures  could  be 
made  faster.  The  compiler  could  automatically  detect  side-effect-free  and  no-leak  functions; 
this  information  might  permit  it  to  explicitly  deallocate  some  objects  such  as  closures  if  it 
could  prove  that  they  could  not  be  referenced  again.  Overall,  though,  it  seems  that,  except 
for  loops  which  are  deliberately  broken  to  avoid  hogging  processors,  no  more  than  a  few  per¬ 
cent  more  performance  can  be  squeezed  out  of  the  code  generated  by  Optimist  II;  however, 
since  the  operating  system  overhead  time  overwhelms  the  execution  time  in  Concurrent 
Smalltalk  methods,  there  might  be  room  for  improvement  through  coordinated  compiler  and 
operating  system  changes. 


’In  Architecture  10,  all  but  the  shortest  bronchos  required  the  value  of  RO  to  be  altered,  rendering  that  register 
practically  useless  for  holding  values  between  statements. 
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Design  Goals 

The  Cosmos  operating  system  was  designed  primarily  as  a  support  kernel  for  running  Con¬ 
current  Smalltalk  programs  on  the  J-Machine.  Nevertheless,  Cosmos  is  not  specialized  to 
Concurrent  Smalltalk,  and  many  of  the  operating  system’s  components  could  be  used  to  sup¬ 
port  a  general  message-passing  environment. 

The  goals  in  designing  the  operating  system  were,  in  order: 

1.  To  make  a  working  operating  system. 

2.  To  make  the  operating  system  as  efficient  as  possible. 

3.  To  make  the  operating  system  as  simple  and  flexible  as  possible. 

The  design  of  the  operating  system  also  had  to  be  small  enough  to  allow  both  it  and  most  of 
the  Optimist  II  compiler  to  be  written  in  one  semester;  for  this  reason  garbage  collection  and 
load  management  facilities  were  not  included  in  the  operating  system.  Several  steps  were 
taken  to  achieve  goal  (1),  including  the  criticality  system  and  the  debugging  techniques  de¬ 
scribed  later.  The  criticality  system  is  an  organized  accounting  method  used  to  ensure  that 
no  re-entrancy  problems  occur  when  operating  system  routines  call  each  other.  Features 
were  added  to  MDPSim  to  detect  and  signal  race  conditions  known  as  hazards.  To  achieve 
goal  (2),  the  entire  operating  system  kernel  was  written  in  hand-optimized  assembly  lan¬ 
guage.  Poor  J-Machine  performance  can  no  longer  be  blamed  solely  on  the  operating  system. 
Goal  (3)  was  achieved  by  providing  general  data  structures  that  are  reused  in  many  compo¬ 
nents  of  the  system. 

Functionality 


The  operating  system  assists  Concurrent  Smalltalk  programs  by  providing  the  following  ser¬ 
vices: 

•  Initialization  and  setup  of  the  J-Machine. 

•  Providing  fault  handlers  for  faults  needed  to  keep  the  J-Machine  running. 

•  Global  function  calls  and  returns. 

•  Looking  up  methods  corresponding  to  class/selector  or  object/selector  pairs. 

•  Context  allocation  and  deallocation  facilities  and  conventions. 

•  Local  and  global  object  allocation,  deallocation,  lookup,  and  migration  facilities.  Mutable 
objects  exist  on  only  one  node  at  a  time,  while  immutable  objects  can  exist  on  many  nodes  at 
a  time;  all  but  the  primary  copy  can  be  purged  when  extra  memory  is  needed. 

•  Support  for  distributed  objects  as  defined  in  Concurrent  Smalltalk. 

•  Support  for  Concurrent  Smalltalk  primitives  such  as  determining  the  type  of  an  object. 

•  Calls  assisting  in  the  creation  and  evaluation  of  closures. 

•  An  integer  division  routine. 

•  Debugging  and  consistency-checking  facilities. 


51 


Concurrent  Smalltalk  on  the  Message-Driven  Processor 


Figure  4-1.  Operating  System  Organization 

The  arrows  represent  calling  patterns  in  the  Cosmos  operating  system.  Every  module  uses  the  fault  handlers; 
those  dependencies  were  omitted  for  clarity.  The  modules  in  bold  boxes  are  roots — they  are  invoked  by  the  user. 

The  modules  in  the  top  section  of  the  figure  are  written  in  Concurrent  Smalltalk;  however,  the  CST  Runtime  mod¬ 
ule  may  not  necessarily  be  portable  to  other  Concurrent  Smalltalk  implementations  because  it  references  some 
MDP  data  structures.  The  modules  in  the  middle  section  are  written  in  MDP  assembly  code  because  they  imple¬ 
ment  functionality  that  cannot  be  easily  expressed  in  Concurrent  Smalltalk.  From  the  point  of  view  of  the  rest  of 
the  operating  system,  though,  these  modules  are  indistinguishable  from  compiled  Concurrent  Smalltalk  code.  The 
modules  in  the  bottom  section  are  fixed  in  the  memory  of  every  MDP  either  because  they  are  critical  to  the  MDP's 
operation  or  because  calling  them  as  functions  would  be  inefficient. 

After  Cosmos  initializes  the  J-Machine,  a  Concurrent  Smalltalk  program  can  be  loaded  using 
Cosmos’s  downloading  facilities.  Once  the  program  is  loaded,  a  single  call  to  Cosmos’s  Apply 
handler  can  start  the  execution  of  one  function  in  the  program.  Whenever  a  function  needs 


52 


Chapter  4 


The  Cosmos  Operating  System 


to  invoke  another  function  or  method,  it  first  calls  the  Cosmos  Ob  jectNode  routine’  to  de¬ 
termine  a  good  node  for  that  invocation  and  then  sends  an  Apply  message  or  one  of  its  vari¬ 
ants  to  that  node.  The  target  node,  upon  receiving  that  message,  executes  the  Cosmos  Apply 
handler  that  fetches  the  function  or  method  code  and  calls  it. 

Many  functions  need  to  store  local  state  in  memory,  either  because  they  need  more  variables 
than  will  fit  in  the  MDFs  registers  or  because  they  make  function  or  method  calls  and  need  a 
place  in  which  to  save  state  for  the  duration  of  the  call.  Cosmos  uses  contexts  to  save  state 
and  provides  routines  to  allocate  and  deallocate  them. 

In  addition.  Cosmos  manages  objects  globally,  migrating  objects  and  code  to  the  nodes  that 
need  them.  Cosmos  keeps  only  one  instance  of  immutable  objects,  but  it  can  make  copies  of 
immutable  objects  and  code.  Also,  Cosmos  provides  routines  to  determine  the  type  of  an  ob¬ 
ject  and  to  create  and  address  distributed  objects.  Finally,  Cosmos  provides  primitives  such 
as  division  that  would  be  hard  to  implement  in  Concurrent  Smalltalk. 

Structure 

The  operating  system  is  composed  of  interacting  modules  shown  in  Figure  4-1.  The  high- 
level  modules  are  built  in  layers  out  of  lower-level  ones;  however,  the  low-level  modules  are 
deeply  interrelated  because  of  the  hardware  restrictions  of  the  MDP.  Furthermore,  due  to 
efficiency  considerations  and  hardware  restrictions  on  faulting,  much  of  the  code  in  some  of 
the  managers  is  inlined  inside  other  managers.  This  is  especially  common  at  the  lowest 
levels  such  as  the  heap  and  context  managers. 

Reading  Guide 

This  chapter  describes  the  handlers  in  the  two  lower  sections  of  Figure  4-1;  the  Concurrent 
Smalltalk  code  is  described  in  Chapter  3.  After  a  brief  overview,  the  handlers  will  be  de¬ 
scribed  in  this  chapter  from  the  bottom  level  up. 

Heap  Manager 

The  heap  manager  manages  the  heap  on  each  MDP.  The  heap  allows  allocation,  dealloca¬ 
tion,  and  purging  of  arbitrary  objects  in  the  local  memory  on  the  MDP.  All  object  references 
are  bounds-checked,  and  primitive  compaction  facilities  are  provided. 

BRAT  Manager 

The  BRAT  manager  keeps  track  of  the  BRAT — Birth/Residence  Address  Table  [38].  The 
BRAT  is  an  associative  table  used  mainly  for  translating  virtual  addresses  to  physical  ad¬ 
dresses,  although  it  is  also  used  for  some  housekeeping  tasks  in  object  migration. 

Object  Manager 

The  object  manager  combines  the  facilities  of  the  heap  manager  and  the  BRAT  manager  to 
provide  a  virtual  name  space  for  the  objects  allocated  by  the  heap  manager.  The  object  man¬ 
ager  is  capable  of  allocating  objects  on  the  local  node  and  giving  them  unique  names.  It  can 
also  determine  that  an  object  does  not  reside  on  the  local  node,  but  it  cannot  access  nonlocal 
objects. 

Context  Manager 

The  context  manager  keeps  track  of  contexts.  A  context  is  the  MDP  equivalent  of  an  invoca¬ 
tion  descriptor  on  a  conventional  computer.  The  context  contains  values  of  the  local  variables 


’Sometimes  that  call  is  optimized  by  Optimist  II  to  a  single  move  from  NNR  instruction. 
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of  a  process,  saved  data  and  ID  register  values,  and  the  instruction  pointer  (IP)  when  a  pro¬ 
cess  suspends. 

Global  Object  Manager 

The  global  object  manager  is  an  extension  of  the  object  manager  to  the  global  virtual  address 
space  of  the  J-Machine.  The  global  object  manager  can  access  nonlocal  objects,  and  it  can 
migrate  objects  between  nodes.  It  can  distinguish  mutable  objects  from  immutable  ones  and 
maintain  copies  of  the  latter  on  many  nodes. 

The  global  object  manager  also  can  determine  the  class  of  an  arbitrary  object,  and  it  is  the 
lowest  level  in  the  operating  system  that  implements  distributed  objects. 

Method  Manager 

The  method  manager  implements  an  association  between  classes,  selectors,  and  methods  on 
top  of  the  global  object  manager.  The  method  manager  can,  given  a  class  and  a  selector, 
quickly  determine  the  appropriate  method  that  represents  applying  the  selector  to  an  object 
of  that  class. 

Control  Manager 

Function  and  method  calls  and  replies  are  dispatched  by  the  control  manager  Every  func¬ 
tion  or  method  call  is  actually  a  message  send  to  an  entry  point  in  the  control  manager, 
which  interprets  the  incoming  message,  makes  sure  it  ’ .  valid,  fetches  the  called  code,  and 
runs  it.  The  control  manager  also  handles  suspending  after  efuture  faults  and  resuming 
when  a  called  function  or  method  returns  a  value. 

Utilities 

The  operating  system  kernel  includes  commonly-used  utilities  that  would  suffer  too  much 
overhead  if  they  had  to  be  called  via  the  standard  function  call  mechanism.  The  current  util¬ 
ities  include  a  divide  system  call  and  calls  that  create  and  evaluate  closures. 

MDP  Runtime 

The  MDP  runtime  system  contains  other  utilities  that  have  to  be  coded  in  MDP  assembly 
language.  Currently  MDP  runtime  utilities  include  a  method  table  lookup  routine  and  func¬ 
tions  that  create  distributed  objects.  When  arrays  are  implemented,  they  will  also  be  imple¬ 
mented  as  MDP  runtime  utilities. 

CST  Rimtime 

The  CST  runtime  system  contains  utilities  which  could  be  coded  in  Concurrent  Smalltalk. 
These  utilities  implement  most  of  the  functions  and  macros  listed  in  the  Concurrent 
Smalltalk  reference  maitual  (Appendix  A),  including  locks,  some  array  code,  and  object-han¬ 
dling  functions  such  as  copiers  and  destructors,  as  well  as  lower-level  functionality  such  as 
global  variables. 

Data  R<  presentation 

Figure  4-2  shows  an  overview  of  the  representations  of  various  Concurrent  Smalltalk  objects. 
The  representations  of  the  complex  Concurrent  Smalltalk  object  such  as  functions,  selectors, 
and  classes  will  be  explained  in  more  detail  in  the  following  sections. 
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NIL 
Symbol 
Qass 
Selector 
Character 
FALSE 
TRUE 
Integer 
Float 
Future 
Standard  Object 
Dist.  Obj.  Constituent 
Dist.  Obj.  Group  Name 

Fi^re  4-2.  Concurrent  Smalltalk  Object  Representations 

Primitive  objects  are  represented  as  above  using  the  MDP's  32-bit  words  with  4-bit  tags.  Objects  not  shown 
above  are  represented  as  standard  objects  using  the  ID  tag.  Due  to  a  shortage  of  tags,  NIL,  symbols,  classes, 
selectors,  and  characters  share  the  same  MDP  tag,  TAGO  (also  known  as  SYM),  and  are  distinguished  by  the  up¬ 
per  four  bits  of  the  data  word.  One  MDP  tag,  TAGA,  has  been  retained  for  future  expansion 

With  the  current  bit  layouts.  Cosmos  is  limited  to  representing  268435456  symbols,  65536  classes,  65536  selec¬ 
tors,  65536  futures,  32678  objects  per  node,  and  32768  distributed  objects  in  the  entire  system.  The  last  three 
limitations  are  especially  severe  and  will  be  considered  in  Chapter  8. 


5  21087  65  1098754  10 


TAGO 

SYM 

0 

TAGO 

SYM 

Symbol  Number 

TAGO 

CLASS 

0 

Class  Number 

TAGO 

SEL 

0 

Selector  Number 

TAGO 

CHAR 

0  ASCII  Code 

BOOL 

0 

0 

BOOL 

0 

1 

INT 

Two's  Complement  Value 

FLOAT 

IEEE  Single-Precision  Float  Value 

FUT 

Serial  Number 

Home  Node  Number 

2  j  y  i  X 

ID 

0 

Serial  Number 

Home  Node  Number 
^  !  7  1  * 

ID 

1 

Serial  Number 

Home  Node  Number 
z  f  y  I  * 

DID 

1 

Serial  Number 

Lg(Stride)  Linear  Home  Node  Number 
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BRAT  Entry 


Figure  4-3.  MDP  Memory  Organization 

The  data  structures  above  are  replicated  on  every  MDP  in  the  J-Machine.  All  of  the  data  structures  except  the 
heap  reside  in  fast  RAM.  The  top  of  the  heap  resides  in  fast  RAM,  but  most  of  if  is  in  slow  RAM. 


Figure  4-3  maps  the  structures  addressable  in  the  physical  address  space  of  every  MDP.  The 
heap  occupies  most  of  memory  and  is  used  for  storing  and  keeping  track  of  Concurrent 
Smalltalk  objects  and  contexts.  The  BRAT  root  table  is  a  separate  hash  table  that  points  to 
the  BRAT  entries  in  the  heap.  The  XLATE  table  is  a  table  used  for  hardware-assisted  asso¬ 
ciative  lookups.  In  addition,  every  MDP  contains  a  copy  of  the  Cosmos  code  and  fault  vector 
assignments  and  a  small  set  of  globals  used  by  Cosmos  and  some  of  the  runtime  routines. 
Finally,  every  MDP  contains  two  hardware-managed  incoming  message  queues. 
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Priorities 

Each  MDP  provides  three  levels  of  execution  priority — background,  priority  0,  and  priority  1. 
The  network  allows  messages  to  be  sent  at  priority  0  or  1;  when  a  message  of  a  given  priority 
arrives  at  a  destination  node,  it  is  queued  in  the  appropriate  priority’s  queue.  The  queues 
are  constantly  monitored  by  the  CPU,  and  if  a  queue  contains  a  higher-priority  message  than 
the  task  currently  running,  the  current  task  is  pre-empted  to  handle  the  message. 

Cosmos  currently  only  uses  the  background  and  priority  0  levels.  It  is  anticipated  that  prior¬ 
ity  1  will  be  used  in  the  future  for  garbage  collection  and  resolving  emergencies  such  as 
queue  or  memory  overflow.  In  addition,  on  a  real  J-Machine  (as  opposed  to  MDPSim),  prior¬ 
ity  1  will  make  a  good  debugging  channel.  Cosmos’s  use  of  the  background  priority  is  cur¬ 
rently  limited  to  initialization;  it  would  be  nice  if  background  mode  could  be  used  for  incre¬ 
mental  heap  compaction,  but  that  may  be  difficult — ^because  of  flaws  in  the  MDP  architec¬ 
ture,  the  background  priority  and  priority  0  share  the  same  sets  of  globals,  ID  and  fault  reg¬ 
isters,  and  fault  vectors,  meaning  that  execution  of  a  priority  0  message  is  likely  to  clobber 
the  state  of  a  background  process. 
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Criticalities 

Cosmos  was  fairly  difficult  to  write  because  almost  all  of  its  routines  are  non-reentrant;  thus, 
locations  of  faults  inside  Cosmos  code  have  to  be  carefully  controlled.  The  MDP  does  not  in¬ 
clude  any  stacks,  which  means  that  each  routine  and  fault  handler  must  save  its  state  in  a 
different  set  of  global  variables.  Furthermore,  the  low-level  routines  have  to  be  veiy  careful 
not  to  alter  the  same  global  or  register  through  some  combination  of  system  calls  and  faults. 
Another  class  of  problems  consists  of  critical  sections  of  code  in  which  physical  addresses  are 
manipulated  in  data  registers  or  objects  are  referenced  assuming  they  are  present  in  the  lo¬ 
cal  memory.  No  heap  compaction  or  object  migration  is  allowed  in  those  sections.  If  a  heap 
compaction  or  object  migration  were  to  occur  in  such  a  section,  the  physical  address  or  object 
reference  would  become  invalid. 

To  make  these  problems  tractable  (but,  nevertheless,  still  difficult),  the  concept  of  a  critical¬ 
ity  was  introduced.  The  criticality  of  a  system  call  is  a  number  which  reflects  what  actions 
that  system  call  is  allowed  to  perform.  The  criticalities  are  listed  in  Table  4-1. 


A  routine  with  a  given  criticality  may  not  call  another  routine  with  a  lower  one.  For  exam¬ 
ple,  if  a  routine  is  sending  a  message,  it  may  not  make  a  system  call  or  allow  a  fault  of  criti¬ 
cality  less  than  4  while  it  is  sending  the  message.  Thus,  the  routine  has  to  force  any  poten¬ 
tial  cfutures  before  sending  the  message,  because  a  cfuture  fault  has  criticality  1.  If  a  routine 
stores  a  physical  address  of  a  heap  block  in  a  data  register,  it  must  have  criticality  at  least  5 
as  long  as  the  address  can  be  read  out  of  the  data  register.  If  a  routine  runs  with  the  MDPs 
fault  bit  set,  it  must  have  criticality  at  least  6  to  prevent  a  catastrophic  double  fault.  There 
will  be  no  re-entrancy  problems  as  long  as  each  routine’s  criticality  is  correct,  the  criticality 
rules  are  obeyed,  and  all  possible  faults  are  anticipated. 

Heap  Manager 

The  heap  manager  manages  the  heap  on  each  MDP,  allowing  allocation,  deallocation,  and 
purging  of  arbitrary  objects  in  the  local  memory  on  the  MDP.  The  heap  manager  does  not 
use  the  network,  so  most  of  its  routines  run  at  criticality  5. 

Heap  Structure 

The  heap,  shown  in  Figure  4-3,  is  organized  as  a  contiguous  block  of  memory.  Objects  are 
allocated  from  the  bottom  (lower  addresses)  up,  while  BRAT  entries  are  allocated  from  the 


Table  4*1.  Criticalities 
Value  Actions  Allowed 

0  All  actions  are  allowed.  Caller’s  registers  do  not  have  to  be  preserved. 

1  Caller’s  registers  must  be  preserved.  May  suspend,  so  MDP's  globals  are  not  pre¬ 
served. 

2  No  suspending  faults,  no  modification  of  context  state. 

3  No  suspending  faults,  no  modification  of  context  state,  no  object  migration. 

4  No  message  sends,  no  object  migration, 

5  No  heap  compaction,  no  message  sends. 

6  No  faults  or  system  calls,  no  heap  compaction,  no  message  sends. 

7  No  priority  1  interrupts,  no  faults  or  system  calls,  no  heap  compaction,  no  message 
sends. 
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Figure  4-4.  A  Heap  Block 

Each  MDP  heap  block  consists  of  a  header  and  ID  words  followed  by  user-defined  data. 


top  down  by  the  BRAT  manager.  The  objects  in  the  heap  between  FixedHeapStart  and 
HeapStart  are  nonrelocatable — once  allocated,  they  are  never  moved.  Currently  that  area 
is  used  for  storing  a  few  fast  contexts.  The  rest  of  the  heap  is  dynamically  divided  between 
relocatable  objects  and  BRAT  entries.  The  FirstFree  pointer  points  to  the  first  unused 
word  of  heap  memory,  while  the  LastFree  pointer  points  to  the  first  word  used  for  BRAT  en¬ 
tries. 

Heap  Blocks 

Each  heap  block  has  the  structure  shown  in  Figure  4-4.  The  presence  of  the  length  of  the 
block  in  the  first  word  and  its  virtual  ID  in  the  second  word  allows  the  heap  to  be  scanned 
and  compacted  quickly. 

The  heap  manager  uses  only  the  free,  purgeable,  and  marked  flags,  which  have  the  following 
meanings: 

•  Free.  The  heap  manager  will  reclaim  storage  from  those  blocks  when  it  needs  extra 
memory. 

•  Purgeable.  The  heap  manager  can  purge  those  blocks  when  it  needs  extra  memory. 

•  Marked.  A  purgeable  block  is  marked  if  it  has  not  been  accessed  for  a  while.  It  will  be 
purged  at  the  next  opportunity. 

The  copyable  and  locked  flags  are  managed  by  the  global  object  manager,  while  the  context 
manager  uses  the  fast  context  flag  to  distinguish  fast  contexts  from  standard  ones. 
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Object  Allocation 

Allocating  an  object  on  the  heap  is  usually  quite  fast,  taking  about  twenty  instructions. 
Given  the  object  ID  and  header  word,  the  AllocObject  heap  manager  routine  checks 
whether  there  is  enough  room  in  the  heap  for  the  object*.  If  so,  it  creates  and  returns  a  relo¬ 
catable  ADDR-tagged  word  pointing  to  the  physical  memory  that  will  be  occupied  by  the  ob¬ 
ject,  after  initializing  the  object’s  first  two  words  and  advancing  the  FirstFree  pointer.  If 
there  is  not  enough  free  memory,  AllocObject  calls  the  heap  compactor  to  try  to  free 
enough  memory  for  the  object. 

Heap  Compaction 

The  heap  compactor  is  called  whenever  a  memory  request  cannot  be  satisfied.  First  it  invali¬ 
dates  all  relocatable  addresses  cached  in  the  address  registers  and  the  XLATE  table^.  Then 
it  scans  through  the  heap  starting  from  HeapStart,  moving  each  block  as  far  to  the  front  of 
the  heap  as  possible.  As  each  block  is  moved,  its  physical  address  is  updated  in  the  BRAT, 
but  not  the  XLATE  table^.  Deleted  blocks  are  not  copied,  nor  are  marked  purgeable  blocks*. 
If  a  purgeable  block  was  unmarked,  it  is  copied  and  then  marked.  The  next  time  the  block  is 
referenced,  that  block’s  marked  bit  will  be  cleared  by  the  XLATE  fault  handler. 

A  heap  compaction  increases  the  amount  of  contiguous  available  memory  between 
FirstFree  and  LastFree.  However,  if  the  compaction  did  not  free  enough  memory  to  sat¬ 
isfy  the  allocation  request,  another  compaction  is  immediately  done.  The  second  compaction 
purges  the  remaining  purgeable  blocks  from  the  heap.  If  the  second  compaction  does  not  free 
enough  memory,  the  system  halts. 

Utility  Routines 

The  heap  manager  contains  a  couple  of  general-purpose  utility  routines  which  illustrate  cre¬ 
ative  use  of  the  MDFs  fault  mechanism.  One,  BlockMove,  quickly  moves  a  block  of  memory 
from  one  address  to  another.  The  routine  uses  straightline  code  followed  by  an  infinite  loop 
to  copy  data.  The  loop  is  terminated  by  a  LIMIT  fault  when  a  copy  is  attempted  of  the  first 
word  out  of  bounds  of  the  source  block.  Similarly,  BlockSend  quickly  sends  words  of  an  ob¬ 
ject  until  terminated  by  a  LIMIT  fault.  Without  using  LIMIT  faults  these  routines  would  be 
two  to  four  times  slower. 

BRAT  Manager 

The  BRAT  manager  maintains  the  BRAT — Birth/Residence  Address  Table  [38]  and  the 
XLATE  table.  The  BRAT  is  a  general-purpose  associative  table  used  mainly  for  translating 
virtual  addresses  to  physical  addresses.  The  XLATE  table  is  used  mostly  as  a  cache  for  the 
BRAT  table.  Table  4-2  lists  the  associations  currently  maintained  by  the  BRAT  manager. 
Like  the  heap  manager,  the  BRAT  manager  does  not  use  the  network  and  runs  mostly  at 
criticality  5. 

The  format  of  the  XLATE  table  is  dictated  by  the  MDP  hardware.  The  table  is  a  two-way  set- 
associative  cache  whose  location  and  position  are  specified  by  the  MDP  TBM  register.  Each 


'Actually,  AllocObject  makes  sure  that  there  are  three  more  free  words  in  the  heap  than  necessary  to  hold  the 
object  in  case  a  BRAT  entry  will  also  be  allocated  for  the  object.  This  avoids  the  difficult  situation  of  being  able  to 
allocate  a  heap  object  but  not  its  BRAT  entry;  a  heap  compaction  in  the  BRAT  manager  would  violate  criticality 
rules. 

^Just  re-entering  each  association  between  a  virtual  ID  and  the  new  physical  address  would  not  work  because  sev¬ 
eral  virtual  IDs  may  alias  to  the  same  physical  object;  the  copying  code  would  find  only  one  such  association  in  the 
XLATE  table. 

^Physical  addresses  are  not  updated  in  the  XLATE  table  because  if  they  were,  there  would  bo  no  easy  way  of  de¬ 
termining  which  blocks  were  referenced  between  heap  compactions.  The  XLATE  fault  handler  clears  the  marked  bit 
of  every  block  it  encounters  without  a  binding  in  the  XLATE  table. 

^Nevertheless,  if  an  object’s  locked  flag  is  set,  the  object  is  preserved,  even  if  it  is  also  indicated  as  deleted  or 
purgeable  and  marked.  This  action  is  required  to  maintain  consistency  in  the  global  object  manager. 
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binding  in  the  XLATE  table  consists  of  a  key  word  and  a  data  word.  Invalid  bindings  have  a 
NIL  data  word.  The  XLATE  and  PROBE  instructions  hash  the  key  they  receive  into  Ae 
XLATE  table  and  check  the  two  possible  bindings  whether  they  contain  the  right  key;  if  so, 
the  corresponding  data  word  is  returned.  The  ENTER  instruction  enters  a  new  binding  into 
the  XLATE  table;  that  binding  might  overwrite  an  existing  binding  of  a  different  key,  so  the 
XLATE  table  is  only  a  cache — bindings  are  not  guaranteed  to  remain  in  the  table.  The  hash 
function  used  is  the  exclusive-or  of  the  four  bytes  that  constitute  the  data  portion  of  the  key 
word;  the  tag  of  the  key  word  does  not  participate  in  the  hashing.  Thus,  the  XLATE  table  is 
limited  by  hardware  to  512  bindings,  which  may  not  be  enough  if  there  are  many  small  ob¬ 
jects  on  a  node. 

Table  4-2.  XLATE  and  BRAT  Associations 


“Virtual’’ Tag  “PhysIcarTag  Tables  Association 


ID 

ADDR 

XLATE,  BRAT 

Physical  object  location 

ID 

INT 

BRAT 

Node  number  of  node  containing  object 

ID 

context  ID 

BRAT 

Context  waiting  for  object 

DID 

ADDR 

XLATE 

Physical  location  of  nearest  constituent 

TAG0;SEL 

ADDR 

XLATE.  BRAT 

Physical  location  of  selector  object 

TAGO:CLASS 

ADDR 

XLATE.  BRAT 

Physical  location  of  class  object 

TAGO:CLASS 

INT 

BRAT 

Node  number  of  node  containing  class  object 

TAGOiCLASS 

context  ID 

BRAT 

Context  waiting  for  class  object 

TAG0:SYM 

none 

XLATE 

Symbols  are  primitive  objects 

TAGOiCHAR 

none 

XLATE 

Characters  are  primitive  objects 

INT 

none 

XLATE 

Integers  are  primitive  objects 

BOOL 

none 

XLATE 

Booleans  are  primitive  objects 

FLOAT 

none 

XLATE 

Floating  point  numbers  are  primitive  objects 

CS(INSTI) 

ID 

XLATE 

Class/selector  lookup 

The  above  table  contains  the  current  associations 

kept  in  the  virtual  tables.  A  general  object  (tagged  ID  or 

TAGO:CLASS)  can  associate  either  to  a  physical  address,  the  node  number  of  the  node  thought  to  contain  the 
object,  or  a  context  waiting  tor  the  object.  In  the  last  case,  it  the  object  is  being  accessed,  the  current  process 
suspends  and  puts  itself  onto  the  list  of  contexts  waiting  for  the  object  Selector  objects  are  just  like  general  ob¬ 
jects  except  that  they  do  not  migrate.  The  DID->ADDR  association  is  used  for  quickly  getting  to  constituents  of 
distributed  objects  from  the  group  10.  The  results  of  the  DID->ADDR  must  be  consistent  through  time — looking  up 
a  DID  on  the  same  node  must  always  yield  the  same  constituent.  Looking  up  a  primitive  object  other  than  the 
ones  just  mentioned  in  the  XLATE  table  must  always  miss.  Finally,  due  to  a  shortage  of  virtual  tags,  words  tagged 
INST1  are  used  as  class/selector  keys  to  the  method  manager's  method  cache. 


XLATE  and  BRAT  Table  Formats 

Unlike  the  XLATE  cache,  entries  in  the  BRAT  table  are  guaranteed  to  remain  in  the  table 
until  they  are  deleted.  As  shown  in  Figure  4-7,  the  BRAT  table  is  rooted  by  a  small  root  hash 
table.  Each  entry  in  the  root  table  points  to  a  linked  list  of  BRAT  bindings  with  keys  that 
hash  to  the  same  value.  In  addition,  there  is  a  linked  list  of  free  BRAT  entries.  There  are 
several  advantages  to  keeping  the  BRAT  table  organized  this  way  instead  of  the  flat  hash 
table  in  [38]: 

•  Deleting  entries  from  the  BRAT  is  easy,  while  at  the  same  time  searching  the  BRAT  for  a 
missing  key  is  fast.  Such  searches  are  common  because  they  occur  almost  evei^  time  an  ob¬ 
ject  not  present  in  local  memory  is  referenced. 

•  The  boundary  between  BRAT  memory  and  the  memory  used  for  objects  in  the  heap  is  ad¬ 
justed  dynamically.  Thus,  accurate  predictions  of  the  average  size  of  an  object  needed  in  [38] 
become  unnecessary. 

•  i\o  memory  is  wasted  keeping  the  flat  hash  table  no  more  than  70%  full.  On  the  other 
hand,  linked  lists  require  one  additional  word  per  BRAT  entry  for  the  links;  however,  it  is 
conceivable  that  BRAT  entries  could  be  stored  contiguously  with  their  objects,  eliminating 
this  waste. 
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TBM 


Figure  4-5.  XLATE  Table  Format 

The  XLATE  table's  position  and  length  are  specified  by  the  MDP  TBM  register.  The  XLATE  table  is  a  two-way  set- 
associative  cache  composed  of  key/data  pairs  of  words.  A  NIL  data  value  specifies  an  invalid  entry.  The  XLATE 
and  PROBE  instructions  provide  hardware  support  for  quickly  looking  up  keys  in  the  cache. 


Figure  4-6.  BRAT  Entry  Format 

Each  BRAT  entry  is  a  linked  list  entry  associating  a  key  word  to  a  data  word. 


BRAT  Routines 

There  are  three  main  routines  for  managing  the  BRAT  table.  They  are: 

•  EnterBinding,  which  enters  a  new  binding  of  a  key  to  a  data  word.  This  routine  uses  a 
binding  from  the  BRATFree  linked  list  whenever  possible.  However,  if  that  list  is  empty, 
memory  is  allocated  from  the  back  of  the  heap,  moving  LastFree  forward  by  three  words, 
which  might  force  a  heap  compaction. 

•  LookupBinding,  which  returns  the  data  word  associated  with  a  key  or  NIL  if  there  is 
none. 

•  DeleteBinding  and  PurgeBinding,  which  remove  a  binding  from  the  BRAT.  The 
binding  must  have  been  present  in  the  BRAT.  In  addition,  PurgeBinding  removes  the  bind¬ 
ing  from  the  XLATE  table. 

Heap  Compaction 

The  current  heap  compactor  in  the  Heap  Manager  does  not  attempt  to  compact  free  BRAT 
entries  linked  on  the  BRATFree  list.  Thus,  once  memory  is  used  for  a  BRAT  entry,  it  can 
only  be  used  for  another  BRAT  entry.  Nevertheless,  performing  such  compaction  by  moving 
BRAT  entries  up  in  memory  would  not  present  any  special  difficulties. 
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BRAT  Hash  Table 


Figure  4-7.  BRAT  Table  Format 

The  BRAT  entries  are  kept  in  linked  lists  rooted  in  the  BFiAT  hash  table.  Free  BRAT  entries  are  linked  in  a  sepa¬ 
rate  linked  list. 


Object  Manager 

The  local  object  manager  combines  the  facility  of  the  heap  manager  and  the  BRAT  manager 
to  provide  a  virtual  name  space  for  the  objects  allocated  by  the  heap  manager.  The  local  ob¬ 
ject  manager  can  allocate  objects  on  the  local  node  and  give  them  unique  names.  The  local 
object  manager  is  tightly  interwoven  with  the  global  object  manager,  so  the  distinction  be¬ 
tween  the  two  managers  is  only  conceptual — their  code  is  inlined  together  in  common  rou¬ 
tines. 
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Class 

TAGO 

CLASS 

0 

Class  Number 

Selector 

TAGO 

SEL 

0 

Selector  Number 

Future 

FUT 

Serial  Number 

Home  Node  Number 
z  !  y  j  X 

Standard  Object 

ID 

0 

Serial  Number 

Home  Node  Number 
z  i  y  t  X 

Dist.  Obj.  Constituent 

ID 

1 

Serial  Number 

Home  Node  Number 
z  !  y  i  X 

Figure  4-8.  Object  ID  Formats 

Words  with  the  above  formats  are  virtual  addresses  of  objects  on  the  heap.  Special  care  must  be  taken  when 
handling  virtual  addresses  which  are  also  futures  to  avoid  forcing  them  prematurely. 


Object  IDs 

The  Object  Manager  recognizes  several  formats  of  object  IDs  and  virtual  addresses,  as  shown 
in  Figure  4-8.  In  addition,  the  Object  Manager  can  generate  unique  new  standard  object  IDs 
by  incrementing  a  local  serial  number  counter  and  adding  it  to  the  local  node  number.  Since 
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no  mechanism  exists  currently  for  reclaiming  IDs,  the  system  will  fail  after  32768  local  ob¬ 
jects  have  been  allocated  at  one  node.  See  Chapter  8  for  a  discussion  of  what  could  be  done 
about  this  problem. 

Each  of  the  IDs  in  Figure  4-8  contains  a  home  node  number  in  the  lowest  16  bits.  For  fu¬ 
tures,  standard  objects,  and  distributed  object  constituents,  the  home  node  number  is  merely 
the  network  number  of  the  MDP  that  serves  as  the  object’s  home;  any  unused  bits  must  be 
zero.  However,  for  classes  and  selectors,  any  of  the  lowest  16  bits  not  used  for  storing  the 
network  number  are  used  to  distinguish  among  several  class  or  selector  objects  sharing  the 
same  home.  For  example,  on  a  1024-node  J-Machine  arranged  as  16x16x4,  bits  0-3,  5-8,  and 
10-11  hold  the  home’s  x,  y,  and  z  coordinates,  respectively,  while  bits  4,  9,  and  12-15  disam¬ 
biguate  among  classes  or  selectors  living  on  the  same  node.  For  this  configuration,  the  class 
object’s  home  node  number  can  be  obtained  by  logically  ANDing  the  class  number  with 
%0000110111101111.  See  Figure  5-12  for  more  on  this. 

Why  not  use  bits  16-27  to  disambiguate  classes  and  selectors  living  on  the  same  node,  as  is 
done  for  objects?  The  reason  is  that  several  parts  of  Cosmos  require  class  and  selector  num¬ 
bers  to  be  no  greater  than  16  bits.  For  instance,  a  class  number  is  stored  in  every  heap  ob¬ 
ject’s  header,  and  the  class  and  selector  numbers  are  concatenated  to  make  a  32-bit  word 
during  method  lookup. 

Routines 

The  local  object  manager  provides  routines  to  allocate  and  deallocate  objects.  The  object-allo¬ 
cating  routine  has  two  variants — AllocNewOb ject  allocates  an  object  given  its  ID  and 
header  word,  while  AllocNextOb  ject  takes  a  header  word  and  generates  a  new  ID  for  the 
object.  Both  variants  then  allocate  local  memory  for  the  object  and  enter  the  binding  of  the 
ID  to  the  physical  address  in  the  BRAT  and  the  XLATE  tables.  AllocNextOb  ject  is  used 
for  most  of  the  general  object-allocating  needs,  while  AllocNewOb  ject  is  used  in  special 
cases — downloading  of  objects  or  allocation  of  distributed  object  constituents — where  an  ob¬ 
ject’s  ID  is  predetermined. 

DeallocateOb  ject,  the  local  object  deallocator,  deletes  an  object’s  bindings  from  the  BRAT 
and  the  XLATE  tables  and  sets  the  object’s  deleted  flag.  Thus,  the  object  will  be  compacted 
during  the  next  heap  compaction.  If  the  object  was  a  distributed  object  constituent,  it  might 
have  had  more  than  one  binding  in  the  XLATE  table;  only  one  such  binding  is  deleted,  so  it 
might  still  be  possible  to  access  a  deleted  constituent  object  through  the  other  bindings  until 
the  second  heap  compaction.  This  is  not  an  error  because  the  consequences  of  accessing  a 
deleted  object  in  Concurrent  Smalltalk  are  undefined. 

The  object  manager  also  provides  a  handler  for  XLATE  faults.  When  an  XLATE  instruction 
that  searches  for  a  local  object  misses,  the  object  manager  searches  the  BRAT  for  the  binding. 
If  it  finds  such  a  binding,  it  returns  the  object’s  physical  address  and  enters  the  object’s 
binding  back  in  the  XLATE  table.  This  is  also  the  point  at  which  the  heap  manager  unmarks 
the  object  if  it  was  previously  marked.  If  the  object’s  binding  was  not  found  in  the  BRAT, 
further  action  depends  on  the  value  of  the  XLATE  action  code’ — the  XLATE  fault  handler 
might  use  the  global  object  manager  to  bring  the  object  onto  this  node,  return  NIL,  or  fail. 

Context  Manager 

The  context  manager  maintains  contexts  which  contain  local  variables  and  saved  register 
values  and  messages  of  processes.  The  structure  of  a  context  is  shown  in  Figure  4-9.  MDP’s 
register  IDl  contains  a  virtual  address  of  the  current  context  at  all  times  when  a  context 
switch  is  possible,  while  A1  contains  the  physical  address  and  length  of  the  context.  Contexts 
are  used  for  the  following  purposes: 


’The  XLATE  action  code  tells  the  XLATE  handler  what  the  user  of  the  XLATE  instruction  wanted  to  accomplish. 
The  action  code  conveys  information  such  as  whether  the  caller  really  needed  to  reference  an  object  (and  the  object 
should  be  brought  locally  if  it  isn’t  present)  or  the  caller  only  wanted  to  tell  if  the  object  exists. 
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Figure  4-9.  Context  Format 

Standard  and  fast  contexts  have  the  above  format  except  that  they  are  only  25  words  long,  while  long  contexts  can 
be  up  to  64  words  long  (the  MDP  only  allows  convenient  addressing  of  the  first  64  words  of  an  object). 

There  is  no  saved  101  field  because  I01  points  to  the  context  itself,  so  it  has  to  be  known  by  whatever  routine  is 
resuming  the  context. 

The  link  field  is  used  for  several  purposes.  Contexts  on  the  Fastco.-.textQueue  are  linked  together  by  their  link 
fields.  When  a  process  suspends  execution,  the  resumption  condition  is  stored  in  the  link  field:  if  the  process 
suspended  because  it  read  a  cfuture  from  a  local  variable  in  the  context,  the  offset  (tagged  CFUT)  of  that  local 
variable  is  stored  in  the  link  field.  If  the  process  suspended  because  it  referenced  a  non-local  object,  the  context  is 
put  on  a  linked  list  of  contexts  waiting  for  the  object  rooted  at  the  object's  BRAT  binding.  The  old  data  value  of  the 
BRAT  binding  is  placed  in  the  link  field  of  the  last  context  waiting  for  the  object.  Since  the  data  value  of  a  BRAT 
entry  can  be  an  integer,  the  INT  tag  cannot  be  used  to  represent  contexts  waiting  for  cfutures 

•  When  a  function  calls  another  function,  it  stores  a  cfuture  in  a  local  variable  in  its  con¬ 
text  and  then  proceeds  to  fault  on  that  cfuture.  The  reply  from  the  called  function  will  store 
its  value  into  the  designated  local  variable,  overwriting  the  cfuture. 

•  When  evaluation  of  a  function  needs  to  be  suspended  for  any  reason,  including  a  cfuture 
fault,  the  function’s  registers  are  saved  in  a  context. 

•  When  evaluation  of  a  function  is  suspended,  the  message  that  invoked  that  function  is 
copied  into  the  beginning  of  the  context  (except  for  the  first  two  words  of  the  message,  which 
are  then  lost).  When  the  function  resumes,  A3,  the  register  which  originally  pointed  to  the 
message,  is  aliased  to  point  to  the  context  to  allow  the  function  to  use  A3  to  refer  to  the  in- 
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coming  message  regardless  of  whether  the  message  has  been  copied  into  the  context  yet  or 
not. 

Context  Availability 

There  are  four  fundamental  approaches  to  allocating  contexts; 

1.  Always  allocate  a  context  at  the  beginning  of  every  function 

2.  Allocate  a  context  at  the  beginning  of  a  function  that  needs 
the  end. 

3.  Lazily  allocate  contexts  only  when  necessary. 

4.  Always  keep  a  context  allocated,  even  when  no  message  is  being  processed. 

Approaches  1  and  2  are  commonly  used  for  stack  frames  on  stack-based  computers.  Initially 
I  chose  approach  3  for  the  context  allocation  strategy.  Approaches  1  and  2  are  simpler  but 
have  the  disadvantage  of  often  allocating  unnecessary  contexts — most  of  the  leaf  nodes  of 
computations  do  not  require  contexts,  and  allocating  contexts  unnecessarily  is  a  considerable 
overhead.  Approach  3  worked  by  storing  an  invalid  address  in  Al,  the  MDP’s  context  ad¬ 
dress  register.  When  a  context  was  needed,  the  access  through  Al  would  fault,  and  a  context 
would  be  allocated.  However,  I  ran  into  two  difficulties  with  approach  3:  allocating  contexts 
through  faulting  on  Al  was  slow  because  determining  the  cause  of  an  INVADR  fault  on  the 
MDP  is  quite  involved,  and  there  were  some  difficult  code  sections  in  the  object  manager 
where  a  fault  might  allocate  a  context,  violating  criticality  rules. 

Due  to  the  above  difficulties,  I  switched  to  approach  4,  which  combines  the  advantages  of 
lazy  context  allocation  with  the  advantages  of  always  allocating  a  context.  In  approach  4, 
when  a  function  finishes  executing,  it  does  not  deallocate  its  context*;  thus,  the  next  message 
that  arrives  does  not  have  to  allocate  a  context.  There  are  two  places  where  approach  4 
involves  a  little  extra  work  than  approach  3:  when  a  function  suspends  on  a  efuture  or  object 
migration  wait,  it  must  allocate  a  new  context  to  avoid  having  its  own  context  overwritten; 
and  when  the  value  of  a  efuture  is  returned  or  an  object  arrives,  the  currently  allocated  con¬ 
text  must  be  deallocated  and  replaced  with  the  suspended  function’s  context.  The  additional 
context  allocation  on  a  efuture  or  object  migration  wait  is  not  a  significant  penalty  because  it 
occurs  on  the  tail  end  of  message  processing — it  does  not  affect  the  latency  of  message  pro¬ 
cessing  until  the  J-Machine  is  fully  loaded.  The  context  deallocation  on  the  reception  of  a 
efuture  value  or  an  object  does  add  to  the  latency,  but  context  deallocation  is  always  fast — it 
takes  only  four  instructions. 

To  avoid  reentrancy  and  criticality  problems,  the  value  in  register  Al  is  required  to  be  always 
valid;  therefore,  any  routine,  such  as  the  heap  compactor,  which  might  invalidate  Al  must 
recalculate  the  value  of  Al  when  it  is  done. 

Kinds  of  Contexts 

There  are  three  kinds  of  contexts:  fast  contexts,  standard  contexts,  and  long  contexts.  A 
fixed  number  of  fast  contexts  is  preallocated  when  an  MDP  is  initialized.  Each  fast  context  is 
25  words  long.  The  fast  contexts  are  nonrelocatable  heap  objects  between  FixedHeapStart 
and  HeapStart.  The  physical  addresses  of  these  contexts  never  need  to  be  invalidated,  so 
these  contexts  are  especially  fast.  Fast  contexts  are  never  deallocated.  Enough  of  these  con¬ 
texts  should  be  allocated  to  serve  a  normal  computation  load  on  an  MDP;  the  current  operat¬ 
ing  system  allocates  eight  per  MDP,  which  is  probably  too  few. 

St  indard  contexts  are  like  fast  contexts  in  that  they  are  25  words  long,  but  they  are  relocat¬ 
able  objects  allocated  from  the  main  heap  area;  thus  a  heap  compaction  invalidates  their 


and  deallocate  it  at  the  end. 
a  context  and  deallocate  it  at 


'This  only  applies  to  functions  which  use  25-word  contexts;  functions  which  use  long  contexts  must  deallocate  their 
contexts  and  allocate  25-word  contexts  upon  exiting. 
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physical  addresses.  Unlike  fast  contexts,  the  storage  occupied  by  standard  contexts  can  be 
reclaimed. 

Fast  contexts  and  standard  contexts  are  eligible  to  be  queued  on  a  linked  list  of  free  contexts 
rooted  by  the  global  variable  FastContext Queue.  Whenever  a  25-word  context  is  desired, 
FastContextQueue  is  checked  first;  if  it  contains  a  context,  that  context  is  unlinked  from 
the  queue  and  used.  Otherwise,  a  standard  context  is  allocated.  When  a  fast  context  is  dis¬ 
posed,  it  is  linked  back  on  the  queue.  When  a  standard  context  is  disposed,  it  is  either  liked 
back  on  the  queue  or  deallocated,  at  the  caller’s  discretion.  These  queue  operations  are 
fast — allocating  a  context  from  the  queue  takes  five  instructions,  while  deallocating  one  onto 
the  queue  takes  four. 

Long  contexts  are  contexts  for  functions  which  require  extra  space  for  local  variables.  Long 
contexts  are  identical  to  standard  contexts  except  that  they  are  longer  and  ineligible  for 
queueing  on  the  FastContextQueue.  When  a  function  that  might  need  a  long  context  starts 
executing,  it  calls  the  NewContext  routine,  which  replaces  the  present  context  with  a  newly 
created  long  context.  NewContext  also  copies  any  relevant  state  such  as  the  message  from 
the  fast  cotitext  to  the  new,  long  context.  A  function  which  allocates  a  long  context  must 
terminate  with  a  call  to  Suspend,  which  disposes  the  long  context  and  allocates  a  new  fast  or 
standard  context.  DisposeContext  can  be  used  to  dispose  a  context  without  allocating  a 
new  one. 

Allocation  and  Deallocation  Calls 

The  routines  to  allocate  and  deallocate  25-word  contexts  are  short  enough  that  they  are  in- 
lined  whenever  they  are  needed.  The  following  calls  are  available  for  handling  long  contexts 
and  the  case  in  which  the  FastContextQueue  is  empty: 

•  AllocFastContext  creates  a  new  fast  context  when  the  queue  is  empty. 

•  Suspend  checks  whether  a  fast  context  was  used  by  the  routine.  If  so,  it  links  it  into  the 
fast  context  queue;  otherwise,  the  context  is  disposed  by  the  heap  manager,  and  a  new  25- 
word  context  allocated, 

•  NewContext  allocates  a  new  long  context.  If  a  context  is  currently  in  use,  it  is  deallo¬ 
cated  after  the  message  has  been  copied  from  it  to  the  new  context. 

•  DisposeContext  is  like  Suspend  except  that  it  does  not  allocate  a  new  25- word  context. 

Suspending  and  Resuming  Processes 

When  a  process  must  be  suspended  because  it  tried  to  read  a  efuture,  perform  an  operation 
on  a  future,  add  two  user-defined  objects  together,  or  reference  a  nonlocal  object,  the  process’s 
state  must  be  saved  in  its  context.  In  particular,  the  values  of  registers  that  need  to  be  pre¬ 
served  must  be  stored  in  the  context  along  with  the  IP  at  which  execution  should  resume. 
Furthermore,  the  reason  for  suspending  must  be  stored  in  the  link  field  of  the  context;  other¬ 
wise,  the  context  might  be  restarted  prematurely,  which  would  lead  to  a  disaster  if  the  con¬ 
text  was  waiting  for  an  object'.  Finally,  a  new  25- word  context  is  allocated  in  A1  and  IDl  to 
prevent  the  suspended  context  from  being  reused. 

When  a  process  is  to  be  resumed,  the  resuming  event  is  checked  against  the  context’s  link 
field  to  make  sure  that  the  context  should,  in  fact,  be  resumed.  If  it  should,  the  existing  con¬ 
text  in  A1  and  IDl  is  deallocated,  and  the  values  of  the  registers  and  IP  read  from  the  con- 


'The  reason  why  restarting  a  context  early  would  crash  the  computer  is  not  obvious.  The  problem  is  not  that  the 
process  would  access  a  bad  object  or  value — the  process  would  fault  and  suspend  again  because  it  still  cannot  refer¬ 
ence  the  nonlocal  object.  Instead,  the  system  crash  would  occur  because  if  a  context  that  had  been  waiting  for  mi¬ 
gration  of  a  nonlocal  object  were  restarted  early,  the  context  would  not  bo  unlinked  from  the  list  of  contexts  waiting 
for  the  object.  The  Reply  handler  would  not  even  be  aware  that  the  context  hod  boon  present  on  the  linked  list. 
Then,  when  the  context’s  process  faulted  again  on  the  missing  objc-ct,  it  would  bo  added  to  the  list  of  contexts  wait¬ 
ing  for  the  object  a  second  time,  corrupting  that  list. 
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text.  If  a  resource  for  which  several  processes  were  waiting  arrives,  one  of  these  processes  is 
resumed  immediately,  while  the  other  ones  are  resumed  later  by  RestartContext  messages 
(Figure  4-10)  which  the  node  sends  it  itself.  A  RestartContext  message  deallocates  the 
existing  context  in  A1  and  IDI  and  then  restarts  the  specified  context. 


MSG 

RestartContext 

2 

ID 

Context  ID 

Figure  4-10.  RestartContext  Message 

The  RestartContext.  message  restarts  the  context  specified  by  the  ID.  The  context  must  be  present  on  the  tar¬ 
get  node. 


Reclaiming  Contexts 

The  current  strategy  for  reclaiming  free  contexts  by  the  heap  compactor  is  somewhat  hap¬ 
hazard.  Fast  contexts  are  never  reclaimed.  Long  contexts  are  always  reclaimed  because 
they  are  required  to  be  deallocated  before  their  processes  can  exit.  On  the  other  hand,  stan¬ 
dard  contexts  are  reclaimed  only  if  enough  processes  call  Suspend  when  they  are  done; 
otherwise,  once  a  standard  context  is  allocated,  it  is  never  deallocated.  This  may  be  an  ad¬ 
vantage  because  once  a  working  set  of  fast  and  standard  contexts  is  allocated  on  an  MDP,  al¬ 
location  of  25-word  contexts  will  always  be  fast.  If  the  lack  of  regular  deallocation  of  stan¬ 
dard  contexts  turns  out  to  be  a  problem,  it  would  only  be  a  simple  modification  to  the  heap 
compactor  to  have  it  scan  the  FastContextQueue  and  deallocate  any  standard  contexts  it 
finds  there. 

Global  Object  Manager 

The  primary  means  of  invoking  the  global  object  manager  is  through  the  local  object  manager 
when  the  latter  cannot  find  a  local  object.  The  global  object  manager  extends  the  local  object 
manager  to  the  global  virtual  address  space  of  the  J-Machine.  Together,  the  two  managers 
provide  an  integrated  facility  for  efficiently  managing  objects  globally  on  the  J-Machine.  The 
managers  can  distinguish  mutable  objects  from  immutable  ones  and  cache  copies  of  the  im¬ 
mutable  objects  on  many  nodes. 

Data  Structures 

Every  object  on  the  J-Machine  has  a  home  node.  The  home  node  most  likely  created  the  ob¬ 
ject,  and  that  node  has  the  responsibility  of  keeping  track  of  the  object’s  location  throughout 
the  object’s  life.  Objects  may  migrate  from  node  to  node,  but  the  object  must  inform  the  home 
node  of  every  such  move.  If  a  node  needs  an  object  and  does  not  know  where  it  is,  it  asks  the 
home  node.  Certain  objects  such  as  contexts,  selectors,  and  immutable  objects  do  not  mi¬ 
grate,  so  such  objects  can  always  be  found  at  their  home  nodes.  The  address  of  the  home 
node  is  usually  encoded  in  the  lowest  16  bits  of  an  object’s  ID  (see  ’^  'jure  4-8).  This  is  a  con¬ 
venient  format  because  the  network  ignores  the  upper  16  bits  of  a  routing  address,  so  mes¬ 
sages  may  be  sent  to  an  object’s  home  node  simply  by  transmitting  the  object’s  ID  as  the 
routing  word. 

In  addition  to  the  flags  used  by  the  local  object  manager,  each  object  has  three  additional 
flags:  copyable,  purgeable,  and  locked.  An  object  is  copyable  if  it  is  immutable.  Many  primi¬ 
tive  objects  are  immutable,  as  are  objects  belonging  to  classes  declared  immutable  by  the 
Concurrent  Smalltalk  programmer.  Furthermore,  the  compiler  might  be  able  to  determine 
that  objects  of  a  particular  .lass  cannot  be  mutated  and  mark  then  copyable,  although  the 
compiler  does  not  perform  this  optimization  at  this  time.  When  a  cop_\-  of  a  copy  able  object  is 
made,  the  copy  is  marked  purgeable.  Thus,  many  '-opies  of  immutable  objects  can  be  made, 
and  the  heap  compactor  can  reclaim  storage  used  by  copies  that  are  no  longer  needed,  Set- 
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ting  the  locked  flag  prevents  an  object  from  migrating  or  being  deleted  during  critical  proto¬ 
col  sections. 


XLATE  Entry 
(I)  None 


BRAT  Entry 
None 


(ID 

ID 

ADDR 

ID 

ADDR 

Next 


Contexts 


Figure  4-11.  Object  XLATE  Table  and  BRAT  Entries 

There  are  five  possible  BRAT  table  states  tor  a  particular  object  Each  object  must  have  a  BRAT  entry  on  its 
home  node  The  XLATE  table  entry,  where  specified,  is  optional  The  states  are  as  follows: 

I  The  object  does  not  exist  on  this  node,  and  its  whereabouts  are  unknown 

II  The  object  exists  on  this  node  Its  physical  address  is  given 

III  The  object  does  not  exist  on  this  node  but  it  is  believed  to  reside  on  the  node  sjDecitied  by  the  integer 

IV  The  object  does  not  exist  on  this  node,  but  the  contexts  linked  to  its  BRAT  entry  are  waiting  for  its  arrival 

V  The  object  does  not  exist  on  this  node,  but  the  contexts  linked  to  its  BRAT  entry  are  waiting  for  its  arrival,  and 

the  object  is  believed  to  reside  on  the  node  specified  by  the  integer 

Only  states  II.  Ill,  and  V  are  allowed  on  an  object  s  homo  node,  while  only  states  I.  II  and  IV  are  allowed  on  the 
other  nodes 
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Every  object  must  always  have  a  BRAT  entry  on  its  home  node.  The  BRAT  entry  can  be  in 
one  of  the  states  shown  in  Figure  4-11.  When  an  object  is  initially  allocated,  its  BRAT  entry 
is  in  state  II.  If  an  object  is  in  state  I  on  its  home  node,  that  object  does  not  exist,  and  any  at¬ 
tempt  to  access  it  halts  the  system. 

Object  Migration 

The  object  migration  protocol  is  a  slightly  simplified  version  of  the  protocol  in  [38].  When  a 
node  requests  an  object  because  it  does  not  have  the  object  in  local  memory,  it  sends  a 
RequestOb  ject  message  (Figure  4-12c)  to  the  object’s  home  node.  If  the  home  node  does  not 
currently  have  the  object  (its  BRAT  table  entry  is  in  states  III  or  V),  it  forwards  the 
RequestOb  ject  message  to  the  node  thought  to  contain  the  object.  If  the  home  node  does 
not  know  about  the  object  (BRAT  state  I),  it  halts  the  system.  This  halt  is  deliberate,  for  it 
detects  accesses  to  deleted  objects.  If  the  RequestOb  ject  message  was  forwarded  to  a  node 
that  has  the  object,  the  message  is  processed  there;  otherwise,  that  node  forwards  the 
RequestOb  ject  message  back  to  the  home  node,  and  the  two  nodes  keep  forwarding  the 
message  to  each  other.  Nevertheless,  since  the  home  node  is  required  to  know  an  object’s 
whereabouts  most  of  the  time,  the  home  node  will  eventually  learn  of  the  object’s  true  loca¬ 
tion  and  forward  the  RequestOb  ject  message  to  the  right  place. 
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Figure  4-12.  Object  Migration  Messages 

The  Accept  Ob  :cct  and  AcKccw-eatie:;  f:  t  messages  are  used  only  for  downloading  objects  into  the  J-Ma- 
chine  and  for  debugging  The  other  four  messages  are  used  lor  successive  steps  of  object  migration 
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(a)  Copying  a  Copyable  Object 
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(b)  Migrating  a  Mutable  Object 
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Figure  4-13.  Object  Migration  Protocol 

When  a  copy  of  an  immutable  object  is  made,  the  copy  is  simply  sent  to  the  requester  as  in  part  (a)  If  a  mutable 
object  has  to  be  moved,  the  protocol  is  more  complicated  because  the  object's  home  node  has  to  be  kept  informed 
about  the  object's  location. 

What  happens  when  the  RequestOb  ject  message  finds  the  object  depends  on  whether  the 
object  is  copyable  or  locked.  If  the  object  is  locked,  the  node  forwards  the  message  back  to  it¬ 
self;  the  message  will  be  handled  once  the  object  is  unlocked.  If  the  object  is  copyable,  the 
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node  simply  mails  a  purgeable,  copyable  copy  of  the  object  in  a  MigrateOb  ject  message  to 
the  requesting  node,  which  then  installs  the  copy  in  its  memory  (Figure  4- 13a).  If  not,  the 
protocol  becomes  more  complicated  (Figure  4- 13b).  The  node  on  which  the  object  is  residing 
deletes  the  object  from  its  memory  and  BRAT  and  sends  the  object  to  the  requesting  node  in 
a  MigrateOb  ject  message.  The  requesting  node  installs  the  object  in  its  memory,  locks  it, 
and  sends  an  UpdateHome  message  to  the  birthnode,  telling  it  about  the  object’s  new  where¬ 
abouts.  Finally,  the  birthnode  sends  an  Unlock  message  to  acknowledge  receipt  of  the 
UpdateHome  message  and  allow  the  object  to  be  moved  again.  Since  a  locked  object  might 
have  been  deleted,  the  Unlock  message  checks  the  object’s  deleted  flag  and  deletes  it  and  its 
BRAT  entry  if  it  was  set.  The  last  two  messages  are  optimized  out  if  the  requesting  node 
happens  to  be  the  object’s  home  node. 

The  object  is  locked  in  the  last  phase  of  the  protocol  to  prevent  the  home  node  from  receiving 
the  UpdateHome  messages  from  two  successive  migrations  out  of  order;  if  that  were  to  hap¬ 
pen,  the  home  node  would  lose  track  of  the  object’s  location.  Alternatively,  counters  could  be 
used  to  achieve  the  same  synchronization,  but  that  solution  would  require  an  extra  word  in 
the  BRAT  and  in  the  object. 

Object  Allocation  and  Deletion 

An  object  can  be  allocated  either  at  the  local  node  or  on  a  remote  node.  The  NewLocalOb- 
ject  system  call  allocates  an  object  locally.  Unlike  the  AllocNextob ject  call, 
NewLocalOb ject  takes  a  class  as  a  parameter  and  extracts  the  appropriate  header  word 
from  the  class  object  (Figure  4-14)  to  use  for  the  object.  Reading  the  class  object  may  involve 
another  call  to  the  global  object  manager  if  a  copy  of  the  class  object  is  not  present  in  local 
memory. 
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Figure  4-14.  Class  Object  Format 

The  instance  object  header  word  is  the  word  that  is  stored  as  the  header  ot  every  object  ol  this  dass  That  word  is 
ni  1  if  the  metaclass  iSprimitive-dass 

In  addition,  each  class  object  contains  an  ordered  list  of  the  class's  ancestors  from  the  most  specific  to  the  least 
specific.  The  class's  ancestors  consist  of  the  class  itself,  its  superclasses,  its  superclasses'  superdasses,  and  so 
on:  each  class  is  listed  at  most  once.  The  ancestors  are  ordered  according  to  a  partial  order  which  always  places 
a  dass  before  any  of  its  superdasses;  thus  the  dass  itself  is  always  the  fust  ancestor  and  object  is  always  the 
last  ancestor. 

The  DisposeOb  ject  system  call  is  used  to  dispose  objects,  both  locally  and  globally.  Dis- 
poseOb  ject  first  tries  to  dispose  the  object  locally;  if  the  object  is  locked,  it  is  marked  as 
deleted  but  not  disposed;  it  will  be  disposed  when  it  is  unlocked.  If  the  object  does  not  reside 
on  this  node,  a  Dispose  message  is  sent  to  the  object’s  home  node,  which  follows  a  route 
analogous  to  the  RequestOb  ject  message  above  and  will  not  be  discussed  further.  If  the  ob¬ 
ject  is  present  on  this  node  but  this  is  not  the  object’s  home  node,  a  DisposeBRAT  message  is 
sent  to  the  home  node  to  dispose  the  object’s  BRAT  entry  there.  If  the  DisposeBRAT  mes¬ 
sage  happens  to  find  another  instance  of  the  object  on  its  home  node,  it  deletes  that  instance 
too. 
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(c)  DisposeBRAT  Message 
Figure  4-15.  Object  Creation  and  Disposal  Messages 

The  NewOb  ject  message  creates  an  object  of  the  given  class  on  the  remote  node  and  returns  its  ID  in  a  Reply 
message.  The  Dispose  message  disposes  an  object  on  the  remote  node,  while  the  DisposeBRAT  message 
disposes  an  object's  home  BRAT  entry. 
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This  protocol  successfully  deletes  the  single  instance  of  a  mutable  object  and  the  unpurgeable 
original  of  an  immutable  object  along  with,  perhaps,  one  copy.  Other  copies,  if  any  exist,  of 
an  immutable  object  are  not  disposed;  however,  they  will  simply  be  purged  out  if  they  are  not 
referenced  for  a  while. 

Other  Services 

The  global  object  manager  provides  two  routines,  ClassOf  and  TypeOf ,  that  can  determine 
the  class  of  any  of  the  objects  listed  in  Figure  4-2.  If  the  object  is  a  primitive  object,  the 
global  object  manager  returns  it  class  directly.  Otherwise,  the  global  object  manager  extracts 
the  class  from  the  object’s  header  and  returns.  In  addition,  the  global  object  manager  pro¬ 
vides  the  Ob  jectNode  routine  which  returns  the  node  number  of  a  node  likely  to  contain  the 
object.  If  the  object  is  primitive,  Ob  jectNode  returns  a  random  node  number.  This  system 
call  is  frequently  used  in  Concurrent  Smalltalk  to  determine  the  node  to  which  an  applica¬ 
tion  message  should  be  sent. 

The  global  object  manager  actively  participates  in  the  process  of  downloading  a  Concurrent 
Smalltalk  program  to  the  J-Machine.  It  provides  support  for  installing  objects  on  nodes 
without  migrating  them  from  anywhere.  If  a  node  receives  an  AcceptObject  message 
(Figure  4- 12a),  it  install;  *Vie  object  and  its  ID  in  its  memory  and  the  BRAT  and  responds 
with  an  AcknowledgeOb  ject  message  (Figure  4-12b)  containing  the  object’s  ID. 

To  avoid  difficulties  with  downloading  objects  recursively  referencing  each  other,  object  IDs 
are  assigned  by  MDPSim  (see  the  section  about  late-binding  references  in  [25])  before  the 
objects  are  downloaded  into  the  J-Machine;  hence,  an  MDP  accepting  an  object  must  also  ac¬ 
cept  the  object’s  ID  instead  of  generating  a  new  one.  The  IDs  assigned  by  MDPSim  use  serial 
numbers  in  the  upper  range  of  the  allowed  numbers,  thus  preventing  ID  conflicts  with  ob¬ 
jects  generated  at  runtime. 

Finally,  the  global  object  manager  provides  support  for  distributed  objects.  This  support  is 
documented  in  the  distributed  object  section  later. 


Initialization 

Upon  powerup  each  MDP  performs  the  following  actions: 
*  Clear  the  address  and  ID  registers  at  all  priorities. 
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•  Clear  the  globals  to  CFUT-tagged  words.  If  an  uninitialized  global  is  accidentally  refer¬ 
enced,  the  MDP  will  halt  because  the  cfuture  handler  can  distinguish  a  valid  cfuture  from  a 
CFUT-tagged  word  that  just  indicates  an  uninitialized  value. 

•  Clear  the  XLATE  table  and  the  BRAT  root  table  to  nil. 

•  Initialize  and  enable  the  network  queues,  but  block  network  message  dispatching  until 
initialization  is  done. 

•  Clear  the  heap  to  CFUT-tagged  words. 

•  Initialize  the  global  variables  that  need  initializing. 

•  Create  eight  nonrelocatable  fast  contexts,  link  them  onto  FastContextQueue,  and  ini¬ 
tialize  HeapStart  to  the  first  word  after  those  contexts. 

•  Unlink  one  fast  context  and  point  priority  O’s  A1  and  IDl  to  it. 

•  Enable  message  dispatching  and  fall  into  an  infinite  loop  in  background  mode. 

The  version  of  Cosmos  for  running  on  a  real  J-Machine  instead  of  MDPSim  has  a  startup  se¬ 
quence  that  also  includes  a  self-test  of  the  CPU,  a  memory  test,  a  network  test,  debugging 
utilities,  and  a  protocol  to  let  each  MDP  determine  its  location  on  the  network. 

Downloading  Programs 

In  the  MDPSim  emulation  of  the  J-Machine,  a  special  non-MDP  network  node  called  the  I/O 
Node  acts  as  the  bridge  between  the  compiler  and  the  J-Machine.  The  compiler  outputs  an 
MDPSim  script  which  queues  a  series  of  objects  in  the  I/O  Node.  The  I/O  Node  then  sends 
AcceptObject  messages  to  the  appropriate  nodes,  waits  for  the  AcknowledgeOb ject 
replies,  and  sends  more  objects  until  all  objects  have  been  downloaded. 

On  the  real  J-Machine,  Concurrent  Smalltalk  programs  are  also  downloaded  through  a  MDP 
that  includes  special  software  to  communicate  with  the  outside  world.  Each  MDP  contains  a 
diagnostic  port  that  lets  the  user  halt  the  MDP  and  directly  examine  and  change  its  memory 
and  state.  The  Cosmos  kernel  is  loaded  onto  the  MDPs  through  these  diagnostic  ports. 
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4.3.  The  Cosmos  Higher-Level  Facilities 

Method  Manager 

The  method  manager  associates  class/selector  pairs  with  methods,  although  it  could  also  be 
used  for  keeping  general  immutable  associations.  It  provides  only  one  routine,  Lookup- 
Method,  with  a  variant,  LookupMethodU,  which  performs  less  processing  of  its  arguments  to 
make  it  more  efficient.  LookupMethod  takes  a  class  word  and  a  selector  word  and  attempts 
to  find  the  method  associated  with  them;  it  is  the  equivalent  of  the  Concurrent  Snialltalk 
method  primitive. 
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Figure  4-16.  Class/Selector  Word  Format 

The  Class/Selector  word  is  formed  by  combining  a  16-bit  class  number  with  a  16-bit  selector  number.  The  word  is 
tagged  CS  (which  is  also  the  INSTi  tag)  to  avoid  conflicts  with  other  kinds  of  bindings  stored  in  the  XLATE  table. 

Lookupmethod  first  attempts  to  look  up  the  association  in  the  local  XLATE  cache.  It  com¬ 
bines  the  16-bit  class  and  selector  numbers  into  a  single  word,  tags  that  word  CS  (Figure  4- 
16),  and  looks  for  a  binding  in  the  XLATE  table.  If  it  finds  a  binding,  the  binding’s  data  word 
is  immediately  returned  as  the  desired  method.  If  no  such  binding  exists,  Lookupmethod 
sends  a  Lookupmethod  message  (Figure  4-l8a)  to  the  selector’s  home  node.  The  message 
will  invoke  the  LookupMethod  runtime  function  on  the  selector  and  the  class. 

The  LookupMethod  runtime  function  executes  on  the  same  node  as  the  selector  object 
(Figure  4-17).  Each  selector  has  a  list  of  methods  defined  for  it  together  with  their  classes. 
LookupMethod  first  tries  to  find  the  given  class  in  the  selector  object;  if  it  finds  it,  it  returns 
the  corresponding  method.  If  LookupMethod  cannot  find  a  method  for  the  given  class,  it  gets 
the  class  object  (Figure  4-14)  and  searches  the  selector’s  method  list  for  the  class’s  ancestors 
until  it  either  finds  a  method  or  runs  out  of  ancestors.  In  the  latter  case  the  method  lookup 
fails  and  LookupMethod  returns  nil.  In  either  case  LookupMethod  returns  the  result  in  a 
MethodReply  message  (Figure  4-18b).  The  requesting  node  then  associates  the 
class/selector  pair  with  the  result  in  its  XLATE  table. 

The  method  lookup  strategy  is  conservative  in  the  use  of  space,  taking  space  roughly  propor¬ 
tional  to  the  number  of  methods  defined  in  the  program.  However,  the  method  lookup  time 
suffers  somewhat,  especially  when  a  method  is  requested  corresponding  to  a  deeply  nested 
class  and  a  selector  with  many  methods  defined;  in  the  worst  case  the  method  lookup  time  is 
the  product  of  the  number  of  ancestors  of  a  class  and  the  number  of  methods  defined  for  the 
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selector.  A  binary  search  could  have  been  used  for  searching  the  method  table,  but  it  would 
have  much  worse  constant  factors,  resulting  in  slower  lookup  for  most  methods,  because  the 
MDP  does  not  have  enough  registers  to  support  the  inner  loop  of  a  binary  search. 

The  methods  are  stored  in  selector  objects  indexed  by  the  class  instead  of  storing  them  in 
class  objects  indexed  by  the  selector  because  the  number  of  selectors  is  usually  much  larger 
than  the  number  of  classes,  and  selectors  tend  to  be  accessed  more  uniformly  than  classes; 
thus,  the  method  lookup  table  can  be  distributed  more  evenly  on  the  J-Machine. 
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Figure  4-17.  Selector  Object  Format 

Each  selector  object  contains  a  table  associating  classes  to  methods. 
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Figure  4-18.  Method  Manager  Messages 

The  LookupMethod  message  requests  a  lookup  of  the  class  and  selector  to  get  NIL  or  a  method  ID,  the  Method 
Reply  message  replies  to  the  lookup. 


Control  Manager 

The  control  manager  dispatches  function  and  method  calls  and  handles  replying  from  func¬ 
tions,  a  task  shared  with  the  context  and  global  object  managers.  The  control  manager’s  code 
is  relatively  short  because  so  much  groundwork  has  been  laid  by  the  previous  managers. 

Function  and  Method  Dispatch 

The  control  manager  handles  three  types  of  messages  for  calling  functions  and  methods: 
Apply,  ApplyFunction,  and  ApplySelector  (Figure  4-19).  The  first  message  can  be  used 
for  applying  an  arbitrary  object — a  function  or  a  selector,  while  the  other  two  messages  can 
only  be  used  for  applying  functions  or  selectors,  respectively.  The  Apply  handler  checks  the 
type  of  its  argument  and  jumps  into  either  the  ApplyFunction  or  ApplySelector  handier, 
as  appropriate;  the  check  takes  three  to  five  instructions. 
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Figure  4-19.  Application  Messages 

The  Apply,  ApplyFunction,  and  ApplySelector  messages  have  identical  fomriats  except  for  the  address 
stored  in  the  header  word.  Each  value  returned  by  the  called  function  corresponds  to  one  two-word  continuation 
passed  to  the  function.  The  continuation  SF>ecifies  either  the  context  ID  and  slot  to  which  that  value  should  be 
replied  or  two  nils  if  that  value  is  ignored  by  the  caller.  The  context  IDs  passed  to  the  function  are  not  necessar¬ 
ily  the  same  due  to  tail  forwarding. 


ApplyFunction  reads  the  ID  of  the  function  from  the  message,  stores  it  in  MDFs  registers 
IDO  and  AO  (the  code  segment  registers),  and  jumps  into  the  fourth  word  of  the  function  ob¬ 
ject  (Figure  4-20).  The  entire  process  takes  only  4  instructions. 

ApplySelector  reads  the  selector  and  the  first  argument  (the  receiver  object)  from  the 
message,  uses  inline  code  to  quickly  determine  the  class  of  the  receiver,  and  calls  Lookup- 
MethodU  to  determine  the  ID  of  the  method  that  should  be  called.  If  the  ID  is  NIL,  Apply¬ 
Selector  halts;  otherwise,  ApplySelector  initializes  IDO  and  AO  and  jumps  into  the 
fourth  word  of  the  function  object.  ApplySelector  takes  23  instructions  in  the  best  case, 
and  considerably  more  if  the  class  of  the  receiver  is  hard  to  determine  or  if  LookupMethodU 
misses  in  the  XLATE  cache. 

Either  of  the  above  handlers  can  suspend  even  before  the  first  instruction  of  the  function  is 
executed  if  the  function  code  or,  in  the  case  of  ApplySelector,  the  receiver  object  is  not  pre¬ 
sent  locally.  Hence,  it  is  important  that  a  valid  context  be  always  present  in  IDI  and  Al.  In 
fact,  a  valid  context  is  present  in  those  registers  as  explained  in  the  context  manager  section. 

Fimction  Calls  and  Replies 

The  control  manager’s  other  task  is  handling  CFUT  faults.  There  are  two  primary  causes  for 
a  CFUT  fault:  a  function  accesses  the  result  of  a  computation  that  has  not  finished  yet,  or 
any  routine  accesses  some  uninitialized  variable.  The  control  manager  distinguishes  these 
two  cases  by  the  data  in  the  CFUT-tagged  word  that  caused  the  fault,  which  is  conveniently 
stored  in  MDP’s  FOPO  register. 

If  the  data  is  positive,  the  fault  was  a  cfuture  fault,  and  the  control  manager  stores  that 
CFUT  word  in  the  current  context’s  link  field  and  suspends  the  context.  The  Optimist  II 
compiler  arranges  for  the  data  portion  of  the  CFUT  word  to  contain  the  offset  of  the  context 
variable  that  was  accessed;  this  way  the  cfuture  handler  does  not  have  to  disassemble  the 
faulted  instruction  to  determine  the  offset.  The  offset  is  needed  later  by  the  Reply  handler 
to  determine  whether  the  context  should  be  restarted. 
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If  the  data  in  the  CFUT-tagged  word  was  zero  or  negative,  the  control  manager  halts  the 
computer  because  an  uninitialized  variable  was  accessed.  On  startup,  all  memory  in  the 
MDFs  heap  is  cleared  to  CFUT:-1. 
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Incoming  message  size  or  nil 

3 

Function  Code 

n-1 

Figure  4-20.  Function  Object  Format 

The  function  object  contains  the  code  for  a  function.  Registers  AO  and  IDO  point  to  the  function  while  it  is  execut¬ 
ing.  The  third  word  contains  the  size  of  the  message  expected  by  the  function  or  NIL  if  the  size  is  not  known  or  the 
function  expects  a  variable  number  of  arguments.  The  compiler  initializes  that  word,  but  the  operating  system 
does  not  check  it  against  the  size  of  tne  message  that  invoked  the  function;  that  check  would  add  at  least  five  in¬ 
structions  to  the  function  dispatch  time. 
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Figure  4-21.  Reply  Message  Format 

The  Repi  y  message  carries  the  reply  value  to  the  specified  slot  in  the  specified  context.  The  context  ID  and  reply 
slot  may  not  be  nil — if  they  were  nil  in  the  Apply  message,  no  Reply  message  is  sent. 

Functions  return  results  to  their  callers  via  Reply  messages  (Figure  4-21).  If  a  function  re¬ 
turns  multiple  values,  it  sends  one  Reply  message  for  each  value  returned.  The  Reply  han¬ 
dler  on  the  caller’s  node  performs  the  following  processing  when  it  receives  the  message: 

1.  The  value  from  the  message  is  stored  over  the  cfuture  in  the  caller’s  context.  However,  if 
the  slot  indicated  in  the  Reply  message  did  not  originally  contain  a  cfuture,  the  Reply  han¬ 
dler  halts  because  some  function  replied  twice  to  the  same  slot  or  the  compiler  generated  in¬ 
correct  code. 

2.  The  CFUT-tagged  link  field  in  the  caller’s  context  is  checked  against  the  slot  number  of 
the  newly  updated  slot.  If  the  numbers  match,  the  context  is  resumed;  otherwise,  the  Reply 
handler  exits  because  the  context  is  waiting  for  some  other  event. 

Actually,  for  reasons  of  efficiency  the  check  in  (1)  is  done  only  if  the  slot  number  in  (2)  doesn’t 
match. 

Utilities 

The  operating  system  kernel  currently  contains  three  utilities:  a  divide  routine,  a  closure 
maker,  and  a  closure  evaluator.  The  Divide  system  call  divides  one  integer  by  another  and 
returns  the  quotient  and  remainder  using  the  sign  conventions  described  in  Appendix  A.  The 
divide  routine  includes  considerable  overhead  to  evaluate  all  signed  32-bit  results  correctly, 
including  special  cases  such  as  dividing  -$80000000  by  1  or  -1  because  a  large-integer 


78 


Chapter  4 


The  Cosmos  Of)erating  System 


package  might  be  implemented  on  top  of  the  normal  integer  arithmetic  routines  sometime  in 
the  future. 

NewClosure,  the  closure  maker,  allocates  and  returns  a  new  closure  object  (Figure  4-22)  on 
the  local  heap.  The  caller  should  then  initialize  the  closure’s  display  arguments  before  using 
the  closure. 

CallClosure  is  the  function  called  by  a  closure  when  it  is  invoked  as  a  function.  CallClo- 
sure  calls  the  function  specified  in  the  closure  with  the  additional  display  arguments  in  the 
closure. 

It  is  true  that  Divide  and  NewClosure  could  have  been  implemented  as  functions  instead  of 
system  calls;  however,  these  routines  are  used  frequently  enough  and  are  short  enough  that 
it  was  decided  that  it  would  be  best  to  make  them  readily  available  whenever  they  are 
needed.  The  additional  overhead  that  would  be  required  in  making  a  function  call  is  compa¬ 
rable  to  the  time  it  takes  to  divide  two  numbers  or  allocate  e  new  closure  object. 
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Figure  4-22.  Closure  Format 

Closures  are  treated  just  like  functions  by  Concurrent  Smalltalk  and  the  control  manager.  When  the  contra!  man¬ 
ager  calls  a  closure,  it  executes  the  instruction  at  offsets,  which  is  a  CallClosure  system  call.  That  system  call 
forwards  the  message  appended  with  the  display  arguments  included  in  the  closure  to  the  function  with  the  ID 
specified  in  the  word  with  offset  4  in  the  closure. 


MDP  Runtime 

The  MDP  runtime  system  contains  utilities  for  which  it  is  not  important  that  they  reside  on 
every  node.  Currently  the  MDP  runtime  system  includes  a  method  lookup  routine  and  two 
routines  that  allocate  distributed  objects  and  are  described  below. 

Distributed  Objects 

A  distributed  object  is  an  object  composed  of  many  constituents.  A  message  sent  to  the  group 
name  of  a  distributed  object  arrives  at  a  constituent  chosen  by  the  operating  system;  the 
hope  is  that  the  operating  system  chooses  the  constituents  evenly  enough  so  as  not  to  over¬ 
load  some  constituents  and  underutilize  others.  In  addition,  each  constituent  of  a  distributed 
object  is  itself  a  Concurrent  Smalltalk  object. 

Distributed  objects  are  supported  by  the  global  object  manager  and  the  MDP  runtime  sys¬ 
tem.  The  MDP  runtime  system  handles  allocation  of  distributed  objects,  while  the  global  ob¬ 
ject  manager  handles  accessing  constituents  of  distributed  objects. 
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Implementation 

Each  distributed  object  is  implemented  solely  as  a  set  of  constituent  objects;  there  is  no 
“group"  data  for  a  distributed  object  anywhere  in  the  system.  The  group  name  of  a  dis¬ 
tributed  object  contains  enough  information  to  permit  quickly  finding  the  ID  of  any  of  its 
constituents  as  well  as  a  convenient  way  to  find  a  nearby  constituent.  The  structure  of  the 
group  name  is  shown  in  Figure  4-23. 
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Figure  4-23.  Distributed  Object  Group  ID 

The  group  ID  (DID)  cortains  the  distributed  object's  serial  number,  linear  “home*  node  number  (explained  in 
Figure  4-24),  and  a  signed  base-2  logarithm  of  the  distributed  object's  stride,  which  is  the  ratio  S  of  the  number  of 
nodes  in  the  J-Machine  to  the  physical  number  of  constituents.  Both  the  physical  number  of  constituents  and  the 
number  of  nodes  in  the  J-Machine  must  be  powers  of  two.  The  Lg(S)  field  is  signed  and  5  bits  long,  ranging  from 
-16  (3=1/65536;  65536N  constituents  on  an  N-node  J-Machine)  to  15  (3=32768;  1  constituent  for  every  32768 
nodes)  by  powers  of  two.  The  linear  home  node  number  H  must  be  less  than  3.  The  kth  constituent,  counting 
from  k=0,  is  located  on  the  node  with  the  linear  number  H+Lk3j. 

If  the  stride  3  is  1  or  greater,  each  constituent  object  has  the  same  serial  number  as  the  group  object.  If  3  is  less 
than  1,  several  constituents  reside  on  every  node  in  the  J-Machine.  and  more  than  one  serial  number  is  required 
to  distinguish  them.  Hence,  the  distributed  object  reserves  1/3  consecutive  constituent  serial  numbers,  and  the 
kth  constituent  has  serial  number  N+(k  mod  1/S)  and  resides  on  the  node  with  the  linear  number  LkSj,  where  N  is 
the  group  name's  serial  number.  H  should  be  zero  in  this  case. 

The  linear  home  node  number  is  used  to  distributed  sparse  distributed  objects  evenly 
throughout  the  J-Machine.  The  linear  home  node  number  is  always  zero  for  dense  dis¬ 
tributed  objects  (ones  with  stride  1  or  less). 

The  physical  size  of  a  distributed  object  has  been  constrained  to  be  a  power  of  two  for  two 
reasons.  First,  it  is  desirable  to  be  able  to  find  any  constituent  from  just  the  information  con¬ 
tained  in  the  DID,  and  encoding  an  arbitrary  distributed  object  size  in  the  DID  would  require 
too  many  bits;  recording  the  logarithm  of  the  size  requires  only  five  bits  for  any  potential 
size.  Second,  unless  some  radically  different  addressing  scheme  were  used,  distributing  the 
constituent  objects  evenly  throughout  the  J-Machine  would  require  a  division  operation  ei¬ 
ther  in  the  Co  routine  or  in  the  PreferredConstituent^  routine. 

A  variant  of  the  current  scheme  has  been  considered  in  which  the  constituents  above  the 
logical  size  of  the  distributed  object  are  not  created.  The  Co  system  call  would  work  fine  in 
such  a  scheme  (except  that  its  range  checking  would  no  longer  be  valid),  but  the  Preferred- 
Constituent  routine  might  return  a  nonexistent  constituent  of  the  distributed  object,  and 
since  it  does  not  know  the  logical  size  of  the  distributed  object,  it  would  not  know  that  the 
constituent  does  not  exist.  It  could,  howev*  r,  inquire  at  the  constituent’s  home  node,  at  the 
expense  of  complicating  and  slowing  down  the  implementation  of  distributed  objects  in  Cos¬ 
mos.  This  variant  may  be  adopted  if  the  loss  of  memory  caused  by  rounding  the  sizes  of  dis¬ 
tributed  objects  up  to  powers  of  two  becomes  too  large. 

Another  consequence  of  rounding  the  sizes  of  distributed  objects  up  to  powers  of  two  is  that 
the  MDPs  with  high  node  numbers  contain  mostly  unused  constituents.  This  difficulty  could 
be  alleviated  by  always  allocating  a  11-bit  random  “home”  node  number,  and  adding  that 
number  to  the  node  number  of  the  constituent  modulo  the  size  of  the  J-Machine,  at  the  ex¬ 
pense  of  complicating  the  Prefer  redConstituent  routine  somewhat.  If  a  J-Machine  has 
more  than  2048  nodes,  bits  could  be  stolen  from  the  serial  number  field  and  added  to  the 
home  node  number  field.  To  avoid  placing  too  severe  a  restriction  on  the  number  of  dis- 
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Figure  4-24.  Looking  up  a  Constituent  in  a  Sparse  Distributed  Object 

This  figure  illustrates  the  co  system  call  looking  up  constituent  5  in  a  16-constituent  distributed  object  on  a  2048- 
node  J-Machine  organi2ed  as  16x  16x3.  The  stride  is  2048/16=128.  so  Ig(strlde)  is  7.  Constituent  0  is  located  on 
the  node  with  the  linear  number  $49.  The  distributed  object's  serial  number  is  $1328. 

Since  the  stride  is  greater  than  1 .  the  constituent  number  5  is  multiplied  by  the  stride  1 28  and  added  to  $49  to  get 
constituent  5's  linear  node  number,  $2C9.  The  dimensions  in  the  linear  node  number  are  packed  together  to 
simplify  arithmetic  operations:  the  co  system  call  unpacks  them  to  get  the  constituent's  ID. 


tributed  objects  in  the  system,  NewDistobj  could  use  both  the  home  node  number  and  the 
serial  node  number  fields  to  distinguish  distributed  objects. 

Locating  Constituents 

The  Co  system  call  implements  Concurrent  Smalltalk’s  co  primitive.  To  find  the  Ath  con¬ 
stituent  ID  of  a  distributed  object,  the  global  object  manager  shifts  k  by  Igistride)  bits  to  the 
left  and  adds  the  linear  home  node  number  to  obtain  the  constituent’s  linear  node  number 
and  ANDs  k  with  a  right-justified  mask  of  max{-lg{Stride),0)  ones  and  adds  it  to  the  serial 
number  from  the  group  object  to  obtain  the  constituent’s  serial  number  (see  Figures  4-24  and 
4-25). 

When  a  message  is  sent  to  the  group  name,  the  translation  from  the  group  name  to  a  con¬ 
stituent  object  happens  transparently  in  the  global  object  manager.  The  Pref er redCon- 
stituent  system  call  also  performs  this  translation.  Just  like  any  ID-to-physical-address 
translation,  the  object  manager  first  checks  the  XLATE  table.  If  it  finds  a  match  for  the  DID 
there,  it  immediately  returns  the  physical  address  from  the  XLATE  table.  If  not,  it  con¬ 
structs  the  ID  of  a  nearby  constituent  by  appending  the  group  serial  number  to  the  local  lin¬ 
ear  node  number  with  the  lowest  maxilg{Stride),0)  bits  replaced  with  the  lowest  bits  from  the 
group  linear  home  node  number.  Then  the  resulting  constituent  ID  is  looked  up  in  the  usual 
object  manager  manner.  If  a  physical  address  of  the  constituent  is  found,  it  is  entered  into 
the  XLATE  table  bound  with  the  DID  to  accelerate  the  lookup  next  time. 

The  above  algorithm  deterministically  maps  every  node  in  the  J-Machine  to  exactly  one  con¬ 
stituent  of  the  distributed  object.  Having  such  a  deterministic  mapping  is  important  because 
a  method  running  on  a  distributed  object  may  reference  the  distributed  object  several  times 
during  its  execution,  and  it  is  very  important  that  it  get  the  same  constituent  every  time. 
For  example,  the  method  might  be  suspended  while  accessing  fields  of  a  constituent.  When 
the  method  restarts  and  references  the  constituent  again,  it  is  important  that  it  refer  to  the 
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Figure  4-25.  Looking  up  a  Constituent  in  a  Dense  Distributed  Object 

This  figure  illustrates  the  co  system  call  looking  up  constituent  25000  in  a  131072-constituent  distributed  object  on 
a  2048-node  J-Machine  organi2ed  as  16x16x8.  The  stride  is  2048/131072=1/64,  so  Ig(stride)  is  -6.  The  home 
node  number  should  be  zero  In  a  dense  object.  The  distributed  object  has  a  block  of  64  reserved  serial  numbers 
starting  with  $1328. 

The  constituent  number  25000  is  multiplied  by  the  stride  1/64  and  added  to  0  to  get  constituent  25000's  linear 
node  number,  $186.  The  constituent's  serial  number  is  determined  by  calculating  25000  MOD  64  and  adding  it  to 
the  base  serial  number.  As  before,  the  dimensions  in  the  linear  node  number  are  unpacked  to  get  the  con¬ 
stituent's  ID, 


same  one.  Since  processes  can’t  migrate  across  nodes,  the  function  will,  in  fact,  refer  to  the 
same  constituent  every  time  it  translates  the  DID  to  a  physical  address. 

The  above  mapping  will  utilize  the  distributed  object’s  constituents  uniformly  if  calls  to  the 
distributed  object  come  from  a  uniform  distribution  of  nodes,  unless  the  stride  is  less  than 
one,  in  which  case  only  one  distributed  object  representative  is  chosen  per  node.  If  the  MDPs 
were  arranged  in  a  linear  array,  the  above  mapping  would  always  yield  either  the  closest  or 
the  second-closest  constituent  to  a  given  node.  Since  the  MDPs  are  actually  arranged  in  a 
two  or  three-dimensional  mesh,  the  mapping  will  tend  to  cluster  the  constituents  in  lines  or 
planes  of  the  mesh,  which  may  or  may  not  produce  favorable  communication  patterns.  Over¬ 
all,  though,  the  current  mapping  approach  does  have  the  advantage  of  simplicity,  and  it  is 
useful  for  small-scale  J-Machines. 

Allocating  Distributed  Objects 

(NewDistobj  classiclass  sizeiinteger)  :distObj  Function 

Distributed  objects  are  allocated  by  calling  the  NewDistobj  function  in  the  MDP  runtime 
system.  That  function  first  checks  whether  it  was  called  on  node  0;  if  not,  it  forwards  its 
message  to  node  0,  and  the  function  is  invoked  there.  If  invoked  on  node  0,  the  function  cal¬ 
culates  the  physical  size  of  the  distributed  object  by  rounding  the  given  logical  size  size  to  the 
nearest  higher  p'^wer  of  two.  Then  the  stride  is  computed  by  dividing  the  number  of  MDPs  in 
the  J-Machine  by  the  physical  size;  since  the  relevant  numbers  are  all  powers  of  two,  the 
computations  are  done  using  base-2  logarithms.  Max{l  I  stride,!)  consecutive  distributed  ob¬ 
ject  serial  numbers  are  allocated  for  this  distributed  object,  and  a  random  home  node  is  cho¬ 
sen  between  0  and  \stride\-!,  inclusive.  A  global  variable  is  used  to  maintain  the  next  free 
DID  number.  Finally,  a  DID  is  constructed  from  the  above  information,  and  a  NewDistob  j- 
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Tree  message  is  sent  to  the  zeroth  constituent  of  the  distributed  object  (which  does  not  exist 
yet,  but  the  Co  function  can  calculate  its  ID  anyway).  When  that  message  returns,  the  DID 
is  returned  to  the  caller. 

(NewDistobjTree  class.'Class  size.'integer  iDrdistObj  start, logDelta.'integer}  .null 

Function 

NewDistob  jTree  creates  constituents  numbered  Start  through  (start+2'°o‘^'®-l)  of  the  dis¬ 
tributed  object  with  the  DID  ID  and  then  returns.  Each  constituent  has  group,  index,  and 
logical  size  instance  variables,  which  are  initialized  to  the  appropriate  values;  Size  is  the  logi¬ 
cal  size.  NewDistob  jTree  works  by  creating  the  constituent  Start  if  logDelta  is  zero  or  by  re¬ 
cursing  itself  on  the  two  halves  of  its  range  if  logDelta  is  positive. 

The  current  implementation  will  have  to  be  extended  on  a  larger  system  so  as  not  to  bottle¬ 
neck  node  0,  but  it  is  adequate  for  small  and  medium-range  systems. 
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4.4.  Summary 

The  Cosmos  operating  system  provides  the  software  extension  to  the  MDP  architecture 
needed  to  run  Concurrent  Smalltalk  programs.  The  operating  system  is  comprised  of  a  ker¬ 
nel  resident  on  each  MDP  and  a  set  of  Concurrent  Smalltalk  functions  written  in  either  MDP 
assembly  language  or  Concurrent  Smalltalk. 

The  operating  system  is  built  in  layers  which  include  the  heap  manager,  BRAT  manager, 
object  manager,  context  manager,  global  object  manager,  method  manager,  control  manager, 
utilities,  and  MDP  and  CST  runtime  systems.  Efficiency  and  re-entrancy  problems  were  re¬ 
curring  issues  in  the  design  of  the  operating  system  kernel.  The  criticality  system  was  de¬ 
veloped  to  deal  with  the  re-entrancy  and  double  faulting  problems.  In  addition,  many  rou¬ 
tines  are  inlined  in  other  routines  to  make  the  efficiency  reasonable  and  avoid  double  faults 
and  re-entrancy  problems  (in  some  cases  a  system  call  cannot  call  another  system  call  but 
can  use  it  inlined  because  there  are  no  more  free  data  registers  on  the  MDP;  global  variables 
cannot  be  used  as  temporaries  in  routines  running  at  criticality  less  than  2). 

The  operating  system  facilities  were  streamlined  and  simplified  compared  with  those  pro¬ 
posed  in  [38].  The  emphasis  was  on  making  resource  allocation  decisions  as  late  as  possible. 
Thus,  the  size  of  the  BRAT  is  varied  dynamically  at  run  time  instead  of  being  fixed  at  operat¬ 
ing  system  compile  time  as  in  [38].  The  object  migration  protocol  has  been  streamlined  com¬ 
pared  with  the  one  in  [38].  The  resource  wait  table  in  [38]  has  been  eliminated  entirely;  the 
BRAT  manager  is  a  general-purpose  mechanism  that  can  perform  the  same  task  better. 

Finally,  a  scheme  for  quickly  addressing  constituents  of  distributed  objects  was  designed. 
The  scheme  is  very  fast  and  requires  only  knowledge  of  a  group  ID  to  find  either  some  nearby 
constituent  or  any  given  constituent.  Disadvantages  of  the  scheme  include  the  necessity  of 
rounding  the  size  of  a  distributed  object  up  to  the  nearest  power  of  two  and  a  resulting  de¬ 
creased  load  on  the  higher-numbered  MDPs  in  the  J-Machine.  Means  of  circumventing  these 
disadvantages  were  explored. 
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Chapter  5.  Sample  Program 

This  chapter  presents  the  progress  of  a  simple  program  through  the  various  stages  of  compi¬ 
lation.  Unfortunately,  it  is  difficult  to  write  a  simple  sample  program  that  exercises  all  of  the 
features  of  a  compiler.  Instead  of  trying  to  write  a  contrived  sample  program  that  exercised 
as  many  features  as  possible,  I  decided  that  a  simpler  program  that  exercised  the  major  op¬ 
timizations  would  m^e  a  better  example.  If  an  illustration  of  a  more  esoteric  optimization  is 
desired,  one  can  write  an  appropriate  Concurrent  Smalltalk  program,  compile  it  with  Opti¬ 
mist  II,  and  watch  the  intermediate  output. 

The  source  program,  listed  in  Figure  5-1,  returns  the  sum  of  the  integers  from  0  to  n.  Figure 
5-2  shows  a  transcript  of  the  interactive  Optimist  II  session  in  which  the  program  was  en¬ 
tered,  tested  on  a  few  inputs,  and  then  compiled. 

(defmethod  average  integer  (b: integer) 

(//  (+  self  b)  2)) 

(defmethod  average  boolean  (b: boolean) 
false) 

(defmethod  rangesum  integer  (high) 

(if  (“  self  high) 
self 

(let  ((middle  (average  self  high))) 

(+  (rangesum  self  middle) 

(rangesum  (+  middle  1)  high))))) 

(defun  sum  (n) 

(rangesum  0  n) ) 

Figure  5-1.  The  Rangesum  Pro^am 

The  sum  function  adds  the  integers  from  0  to  n,  inclusive.  The  rangesum  method  adds  the  integers  from  self  to 
high,  inclusive.  The  average  method  returns  the  average  of  wto  integers:  the  definition  of  average  for  booleans 
was  included  just  to  confuse  the  compiler  a  bit. 

CST: (+  2  2) 

♦<Integer  4> 

CST: (includ«) 

#<Csc-Lambda  5024988  SUM> 

CST: (•uB  0) 
f<Integer  0> 

CST: («UB  1) 
l<Integer  1> 

CST: (SUB  2) 

•  <Int.eger  3> 

CST: (SUB  XO) 
f<Integer  55> 

CST: (svsrsgs  3  5} 

#<Integer  A> 

CST:  (svsrsgs  trus  fslss) 

•<ralse> 

CST: (sub  100) 

#<Integer  505C> 

CST ;  (rsrvgssw  10  13) 

•<Integer  46> 

CST:  (coapils  sub  **:  rfaet.'RsngssuB.adp'*) 

Optimizing  #<Cst.-LaiTiDda  4713968  CST::SUM> 

Expanded  cont Inuations 
Folded  constants 
Forwarded  replies 

Optimizing  •<Cst-Lambda  4711  636  CST: :  RANGESi;y,> 

*apsed  neon  cur  rent  lys 
Expanded  cont  1  ^’uat  Ions 
Specialized  local  type:. 

Deleted  moves 
Deleted  touches 
Folded  constants 

Optlm.izlng  #<Cst-Lambda  4709940  CST r  : AVERAGE> 

Expanded  cont i nuat ions 
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Specialized  local  types 
Deleted  locals 

Baclt  to  #<Cst-Lambda  4711636  CST: :  RANGESUM> 

Substituted  Inlines 
Specialized  local  types 
Deleted  moves 
Deleted  touches 
Propagated  values 
Deleted  dead  definitions 
Deleted  locals 

Baclc  to  i<Cst -Lambda  4713968  CST::SUM> 

Deleted  locals 
inserted  ENTER  and  EXIT 
Split  statements 
Optimized  built-ins 
Inserted  ENTER  and  EXIT 

Generating  code 

Assembling 
Initialized  vlocs 

Printing 
Assigned  labels 

Generating  code 

Assembl ing 
Inserted  branches 
Initialized  vlocs 
Compacted  SENDs 

Printing 
Assigned  labels 
#<Cst-Lambda  4713968  SUM> 

Figure  5-2.  Rangesum  Interactive  Session 

The  Rangesum  file  was  read  in  the  (include)  directive,  at  which  time  the  user  interactively  chose  the  file  name  us¬ 
ing  a  Macintosh  dialog.  A  few  functions  were  then  tested,  after  which  point  the  file  was  compiled 

The  following  sections  will  illustrate  the  actions  of  some  of  the  compiler’s  optimizations  on 
the  program  in  Figure  5-1.  Please  refer  to  Chapter  3  and  [21]  for  explanations  of  the  trans¬ 
formations. 

Initial  Phase 


The  initial  phase  of  the  compiler  first  performs  a  few  macro  expansions  on  the  input  pro¬ 
gram,  compiles  the  program  into  hcode,  and  then  performs  some  transformations  on  that 
hcode  to  get  it  into  a  form  that  the  rest  of  the  compiler  can  use.  Figure  5-3  shows  the 
macroexpansions  which  are  done  by  the  Optimist  II  parser,  and  Figure  5-4  shows  the  hcode 
produced  by  the  parser.  To  save  space,  only  the  transformations  on  the  rangesum  method 
will  be  shown  from  this  point  on. 

Optimization  Phase 

The  Optimist  II  optimization  phase  performs  local  and  global  optimizations  on  the  program. 
The  order  of  the  optimizations  can  be  seen  in  the  transcript  in  Figure  5-2;  the  compiler  often 
interrupts  the  optimization  of  one  function  to  optimize  another  because  it  wants  to  inline  the 
second  function  in  the  first. 

The  first  transformation  done  by  the  optimization  phase  is  the  collapsing  of  nconcurrentlys 
and  the  expansion  of  continuations  to  the  two-variable  format,  yielding  the  hcode  in  Figure  5- 
5.  The  threads  of  the  nconcurrenlly  are  inlined  in  the  function’s  main  body,  and  the  nconcur- 
rently  statement  is  removed.  Then,  since  an  MDP  continuation  is  actually  two  words  (a  con¬ 
text  ID  and  an  offset  within  that  context  where  the  return  value  should  be  stored),  each  con¬ 
tinuation  variable  is  replaced  by  two  variables. 
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Sample  Program 


(defmethod  rangesum  integer  (high) 

(if  (-  self  high) 
sel  f 

(let  ((middle  (average  self  high)}) 

(4  (rangesum  self  middle) 

(rangesum  (4  middle  1)  high})))} 

(DEFMETHOO  RANGESUM  INTEGER  (HIGH)  :  I  .‘OBJECT 
(IF  (-  SELF  HIGH) 

SELF 

(LET  {(MIDDLE  (AVERAGE  SELF  HIGH))) 

(4  (RANGESUM  SELF  MIDDLE)  (RANGESUM  (4  MIDDLE  1)  HIGH))))) 

(DEFMETHOD  RANGESUM  INTEGER  (HIGH) :: (CONTINUATION: • :OBJECT) 

(IF  (-  SELF  HIGH) 

SELF 

(LET  ((MIDDLE  (AVERAGE  SELF  HIGH))} 

(4  (RANGESUM  SELF  MIDDLE)  (RANGESUM  (4  MIDDLE  1}  HIGH))))} 


(BEGIN 

(DEFSELECTOR  RANGESUM) 

(ADD-METHOD  RANGESUM  INTEGER 

(METHOD-LAMBDA  INTEGER  (HIGH) :: (CONTINUATION: «: OBJECT)  4NAME  RANGESUM 
(IF  (-  SELF  HIGH) 

SELF 

(LET  ((MIDDLE  (AVERAGE  SELF  HIGH))) 

(4  (RANGESUM  SELF  MIDDLE)  (RANGESUM  (+  MIDDLE  1)  HIGH))))))) 

...  (LAMBDA  (SELF:INTEGER  HIGH)  ::  (CONTINUATION  :•  :OBJECT)  tNAME  RANGESUM 
(_WITH-OBJECT  (SELF: INTEGER) 

(IF  (-  SELF  HIGH) 

SELF 

(LET  ((MIDDLE  /AVERAGE  SELF  HIGH))) 

(4  (RANGESUM  SELF  MIDDLE)  (RANGESUM  (♦  MIDDLE  1)  HIGH)  >  )  )  )  )  ... 


Figure  5-3.  Rangesum  Macroexpansion 

The  rangesum  function  is  first  macroexpanded  through  two  macros  that  add  the  class  of  the  continuation  to  the 
defmethod  syntax  (see  Section  A. 5}.  Then  the  defmetnod  itself  is  expanded  into  a  combination  of  a  defse- 
lector  and  an  add-method  of  a  met  hod- lambda.  Later  the  method- 1  ambda  is  expanded  into  a  lambda. 


(LAMBDA  CST: :RANG£SUM 

(♦<Parameter  CST::S£Lr  •<P-Class  CST: : INTEGER> 

#<Parameter  CST::HIGH  i<s-Class  CST:  :OBJECT:>) 

(#<Parameter  CST: :CONTINUATION  l<Coni-Type  #<S-Class  CST: :OBJECT>>> ) 

0 

(({LOCAL  43S)  l<S-CIass  CST : : OBJ£CT>) 

((LOCAL  434)  «<S-Class  CST: :OBJECT>) 

((LOCAL  433)  f<S-ClabS  CST : : OBJECT>) 

((LOCAL  432)  #<S-Ciass  CST: :OBJECT>) 

((LOCAL  CST::MIDDLE)  f<S-Class  CST: :OBJECT>) 

((LOCAL  431)  MS-Class  CST;  :OBJECT>) 

((LOCAL  430)  #<S-Class  CST : : OBJ£CT>) 

((LOCAL  429)  f<S-Ciass  CST: :OBJECT>) 

((LOCAL  CST::SELF}  #<P-Class  CST: : INTEGER>) 

((LOCAL  CST::HIGH)  •<S-Ciass  CST : : OB JECT>) 

((LOCAL  CST; :CONTINUATION)  f<Cont-Type  f<S-Class  CST: :OBJECT>) ) 

(ASSERT-TYPE  •<P-Ciass  CST:  ;  INTEGER>  (LOCAL  CST::.5ELF)) 

(APPLY  ( (LOCAL  429)  ) 

(#<Bullt-In-SeIector  CST::->  (LOCAL  CST::SELF)  (LOCAL  CST::HIGH))) 

(IF  ;FALSE  (LOCAL  429)  25i8'7} 

(MOVE  (LOCAL  430)  (LOCAL  CST::SELF)) 

(JUMP  2611) 

(LABEL  2587) 

(APPLY  ((LOCAL  431))  ((GLOBAL  CST AVERAGE)  (LOCAL  CST:;SELF)  (LOCAL  CST::HIGH))) 

(MOVE  (LOCAL  CST;;MIDDLE}  (LOCAL  431)) 

(TOUCH  (LOCAL  CST MIDDLE ) ) 

(NCONCURRENTLY 
(  (  (APPLY  (  (LOCAL  433)  ) 

(#<Bui it-In-Selector  CST::->  (LOCAL  CST:;M:ddLE)  #<Integer  !>)) 

(APPLY  ((LOCAL  434))  ((GLOBAL  CST: :RANGESUM)  (LOCAL  433)  (LOCAL  CST : : H IGH) ) ) ) 

(  (APPLY  (  (LOCAL  432)  ) 

( (GLOBAL  CST; rRANGESUM)  (LOCAL  CST: :SELF)  (LOCAL  CST: zMIDDLE) ) ) ) ) ) 

(APPLY  ((LOCAL  435))  (#<Bui It-In-Selector  CST::*>  (LOCAL  432)  (LOCAL  434))) 

(MOVE  (LOCAL  430)  (LOCAL  435)) 

(LABEL  2611) 

(MOVE  (CCNT-REF  LOCAL  CST : ; CONT INUATION)  (LOCAL  430))) 

Figure  5-4.  Initial  Rangesum  Hcode 

This  hcode  is  the  final  output  ot  the  initial  phase.  The  lamtxfa  is  comprised  of  the  two  parameters  (self  and  high), 
a  return  (continuation),  no  display  parameters,  a  list  of  local  variables,  and  a  representation  of  the  hcode  digraph 


Next,  the  compiler  starts  the  iterative  optimizations.  The  first  successful  one  is  local  type 
specialization,  which  uses  type  dataflow  analysis  to  detect  the  fact  that  local  429  always 
holds  a  boolean  value,  so  it  changes  local  429’s  type  to  boolean. 
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(XAMBDA  CST::RANGESUM 

(icParaneter  CST::SELr  l<P>Class  CST: : lNT£GeR> 

«<Parameter  CST::HIGK  «<S-Class  CST: :OBJECT>) 

<#<Parameter  CST; : CONTINUATION  #<Cont-Type  #<5-01388  CST:  :OBJECT»>) 

0 

({(LOCAL  435)  »<S-Ciass  CST; ;0BJECT>) 

((LOCAL  434)  l<S-Class  CST; ;0BJECT>) 

((LOCAL  433)  #<S-Class  CST; ;0BJECT>) 

((LOCAL  432)  #<S-Class  CST: :OdJECT>) 

((LOCAL  CST::MIDDLE)  #<S-Class  CST: :OBJECT>) 

((LOCAL  431)  #<S-Class  CST: :OBJECT>) 

((LOCAL  430)  #<S-Class  CST; :0BJECT>) 

((LOCAL  429)  #<S-Ciass  CST: :OajECT>) 

((LOCAL  CST:: SELF)  #<P-Class  CST; ; INTEGER>) 

((LOCAL  CST;:HIGH)  #<S-Class  CST: :0BJECT>) 

((LOCAL  CST; :CONTINUATION)  #<Cont-Type  #<S-Class  CST: :OBJECT>) 

((LOCAL  CST: :CONTINUATION)  i<P-Class  CST: :CONTEXT>) 

((LOCAL  CST:  .'CONTINUATION)  »<Dlsp-Type  #<S-Class  CST:  :OBJECT>)  ) 

(ASSERT-TYPE  •<P-Class  CST : : INTEGER>  (LOCAL  CST: :SELF) ) 

(APPLY  ( (LOCAL  429) ) 

(#<Bullt-ln-Selector  CST:;->  (LOCAL  CST::SELF)  (LOCAL  CST::HIGH))) 

(IF  :FALSE  (LOCAL  429)  2587) 

(MOVE  (LOCAL  430)  (LOCAL  CST::SELF)) 

(JUMP  2611) 

(LABEL  2587) 

(APPLY  ((LOCAL  431))  (#<Selector  CST: :AVERAGE>  (LOCAL  CST::SELF)  (LOCAL  CST::HIGH))) 

(MOVE  (LOCAL  CST: :MIDDLE)  (LOCAL  431)) 

(TOUCH  (LOCAL  CST: :MIDDLE) ) 

(APPLY  ((LOCAL  433))  (•<Built-In-Selector  CST::4>  (LOCAL  CST: rMIDDLE)  #<Inceger  !>)) 

(APPLY  ((LOCAL  434))  (f<SelecCor  CST ; : RANGESUM>  (LOCAL  433)  (LOCAL  CST::HIGH))) 

(APPLY  ((LOCAL  432))  ( i<Se lector  CST RANGESUM>  (LOCAL  CST :: SELF)  (LOCAL  CST: : MIDDLE )) ) 

(APPLY  ((LOCAL  435) )  (i<Bui It-In-Selector  CST::+>  (LOCAL  432)  (LOCAL  434))) 

(MOVE  (LOCAL  430)  (LOCAL  435)) 

(LABEL  2611) 

(MOVE  (CONT-REF  (LOCAL  CST :: CONTINUATI ON)  (LOCAL  CST: :CONTINUATION) )  (LOCAL  430))) 

Figure  5-5,  Hcode  after  Initial  Transformations 

The  nconcurrently  statement  has  been  broken  into  its  threads,  and  two  variables  assigned  to  hold  the  continuation. 
The  two  new  continuation  variables  have  the  same  name  as  the  sir^le  old  continuation  variable,  which  is  still  pre¬ 
sent,  but  the  compiler  does  not  get  confused  over  variable  name  conflicts. 


(LAMBDA  CST; rRANGESUM 

(#<Parameter  CST::SELF  #<P-Class  CST : : INTEGER> 

#<Parameter  CST::HIGH  #<S-Class  CST; :OBJ£CT>) 

(#<Parameter  CST;  :CONTtNUATION  l<Cont-Type  #<S-Class  CST:  :OBJECT»>) 

0 

({(LOCAL  435)  «<S-Class  CST; :OBJECT>) 

((LOCAL  434}  #<S-Class  CST: :0BJECT>} 

((LOCAL  433)  #<S-Class  CST: ;0BJECT>) 

((LOCAL  432)  #<S-Class  CST: :OBJECT>) 

((LOCAL  CST:  .-MIDDLE)  #<S-CJass  CST;  :0BJECT>) 

((LOCAL  431)  #<S-Class  CST; :0BJECT>) 

((LOCAL  430)  l<S-Class  CST: :OBJECT>) 

((LOCAL  429)  f<P-Class  CST: ;BOOLEAN>) 

( (LOCAL  CST: :SELF)  •<S-Class  CST: :0BJECT>) 

((LOCAL  CST::H1GH)  #<S-Class  CST: ;OBJECT>) 

((LOCAL  CST: :CONTlNUATION)  #<Cont-Type  #<S-Class  CST: :OBJECT>) 

((LOCAL  CST:  .-CONTINUATION)  #<P-Class  CST:  ;CONTEXT>) 

((LOCAL  CST: :CONTINUATION)  »<Disp-Type  #<S-Class  CST: :OBJECT>) ) 

(ASSERT-TYPE  •<P-Class  CST : : INTEG£R>  (LOCAL  435)) 

(APPLY  ((LOCAL  429))  ( #<Bu i U- In-Se 1 ector  CST:;->  (LOCAL  435)  (LOCAL  CST::HIGH))) 

(IF  :FALSE  (LOCAL  429)  2587) 

(JUMP  2611) 

(LABEL  2587) 

(APPLY  ((LOCAL  431))  (^<Selector  CST; : AVERAGE>  (LOCAL  435)  (LOCAL  CST: :HIGH) ) ) 

(APPLY  ((LOCAL  433))  ( #<Bu 1 U- I n-Sel ect or  CST::+>  (LOCAL  431)  •<lnteqer  !>)} 

(APPLY  ((LOCAL  434))  (#<SeiecLor  CST : : RANGESUM>  (LOCAL  433)  (LOCAL  CST;:LIGH>)> 

(APPLY  ((LOCAL  432))  (#<Select.or  CST:  :  RANGES’JM>  (LOCAL  435)  (LOCAL  431))) 

(APPLY  ((LOCAL  435))  ( #<Bui U- I n-Sel ect or  CST::+>  (LOCAL  432)  (LOCAL  434))) 

(LABEL  2611) 

(MOVE  (CONT-REF  (LOCAL  CST :: CONTINUATI ON)  (LOCAL  CST :: CONTINUATION ) )  (LOCAL  435))) 

Figure  5-6.  Locally  Optimized  Hcode 

This  hcode  has  been  fully  optimized  using  the  optimizations  in  the  original  Optimist  compiler.  Note  that  due  to 
move  elimination  the  self  parameter  is  no  longer  stored  in  the  old  self  local:  instead,  a  ne\M  local  numbered  435  is 
now  used  to  hold  the  self  value. 


Afterwards,  the  standard  dataflow  optimizations  described  in  [21]  remove  a  few  moves  and  a 
touch  to  yield  the  hcode  in  Figure  5-6.  Then  the  constant  folder  realizes  through  type  infer¬ 
ence  that  only  one  possible  method  of  the  rangesum  and  average  selectors  could  be  called,  so 
it  replaces  the  method  calls  with  direct  function  calls  (Figure  5-7). 
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Sample  Program 


(LAMBDA  CST; :RANGESUM 

(•cParameter  CST::SELF  l<P-Ciass  CST: : 1NTECER> 
•<Parameter  CST::H:gh  #<$-01388  CST: :OBJECT>) 


(#<Parameter  CST: iCONTINUATION  #<Cont-Type  #<S-Class  CST: :OBJECT>>>) 

0 

(((LOCAL  435)  f<S-Cla8S  CST: :OBJECT>) 

((LOCAL  434)  #<S-Class  CST : : OBJECT>) 

((LOCAL  433)  •<S-CUs8  CST : :  OBJECT>) 

((LOCAL  432)  *<S-Class  CST : : OBJECT>) 

((LOCAL  CST::MIDDLE)  •<S-Class  CST : : OBJECT>) 

((LOCAL  431)  •<S-Class  CST: :OBJECT>) 

{(LOCAL  430)  #<S-Class  CST : : OB JECT>> 

((LOCAL  429)  •<P-Class  CST: : BOOLEAN>) 

((LOCAL  CST::SELF)  f<S-Clas8  CST : :OBJECT>) 

((LOCAL  CST::HIGH)  •<S-Class  CST: :OBJECT>) 

((LOCAL  CST: :CONTINUATION)  #<Cont-Type  #<S-Class  CST: :OBJECT>) 

((LOCAL  CST: :CONTINUATION)  #<P-Class  CST: :CONTEXT>) 

((LOCAL  CST: :CONTINUATION)  #<Disp-Type  #<$-01388  CST: :OBJECT>) ) 

(ASSERT-TYPE  #<P-Class  CST: :INTEGER>  (LOCAL  435)) 

(APPLY  ((LOCAL  429))  (#<BulU-In-Selector  CST::->  (LOCAL  435)  (LOCAL  CST:  :HIGH)  )  ) 
(IF  :FALSE  (LOCAL  429)  2587) 

(JUMP  2611) 

(LABEL  2587) 

(APPLY  ((LOCAL  431)) 

(APPLY  ((LOCAL  433)) 

(APPLY  ((LOCAL  434)) 

(APPLY  ( (LOCAL  432) ) 

(APPLY  ((LOCAL  435)) 

(LABEL  2611) 

(MOVE  (CONT-REF  (LOCAL  CST: iCONTINUATION)  (LOCAL  CST: iCONTINUATION) )  (LOCAL  435))} 


((LAMBDA  CST;: AVERAGE)  (LOCAL  435)  (LOCAL  CST: :HIGH) ) ) 
(#<Built-In-Selector  CST::+>  (LOCAL  431)  #<Integer  1>) ) 
((LAMBDA  CST::RANGESUM>  (LOCAL  433)  (LOCAL  CST::HIGH))) 
((LAMBDA  CSTllRANGESUM)  (LOCAL  435)  (LOCAL  431))) 
(#<Bullc-In-Selector  CST::+>  (LOCAL  432)  (LOCAL  434))) 


Figure  5-7.  Hcode  after  Global  Constant  Propagation 

The  constant  propagator  found  that  the  average  and  rangesum  method  calls  would  always  invoke  the  same  meth¬ 
ods.  so  it  replaced  them  with  function  calls. 


(LAMBDA  CST::AVERAGE 

(•<Parameter  CST::SELF  •<P-Class  CST: : INTEGER> 

#<Parameter  CST::B  #<P-Class  CST: : INTEGER>) 

(«<Parameter  CST:  iCONTINUATION  #<Cont-Type  #<$-01388  CST:  :OBJECT>») 

0 

(((LOCAL  424)  #<P-Class  CST: : INTEGER>) 

((LOCAL  423)  #<P-Class  CST: : INTEG£R>) 

((LOCAL  CSTiiSELF)  #<P-Class  CST: : INTEGER>) 

((LOCAL  CSTliB)  #<P-Class  CST: :INTEGEft>) 

((LOCAL  CST; ICONTINUATION)  #<P-Ciass  CST: :CONTEXT>) 

((LOCAL  CST: iCONTINUATION)  #<01sp-Type  #<S-CJass  CST: :OBJ£CT>) ) 

(ASSERT-TYPE  #<P-Class  CST: :INTEGER>  (LOCAL  CST: : SELF) ) 

(APPLY  ((LOCAL  423))  ( « <Bui 1 1 -I n-Se iect or  CST : : ♦>  (LOCAL  CST: iSELF)  (LOCAL  CST: :B) ) ) 

(APPLY  ((LOCAL  424))  (I <Bui U -I n-Select or  CST:://>  (LOCAL  423)  #<Integer  2>)) 

(MOVE  (CONT-REF  (LOCAL  CST: iCONTINUATION)  (LOCAL  CST: iCONTINUATION) )  (LOCAL  424))) 

Figure  5-8.  Optimized  Average  Hcode 

The  average  method  for  integers  has  been  optimized  in  an  attempt  to  inline  it  inside  rangesum.. 


Next,  the  optimizer  attempts  to  inline  the  average  and  rangesum  functions.  Due  to  the  an¬ 
tirecursion  restrictions,  it  cannot  inline  rangesum  inside  itself,  but  it  is  more  successful  with 
average.  In  order  to  inline  average,  it  first  optimizes  it,  yielding  the  hcode  in  Figure  5-8. 
Then  it  checks  that  the  inlining  heuristics  are  satisfied — they  are  because  the  optimized  av¬ 
erage  contains  only  two  primitive  calls.  Average  does  not  perform  any  computation  after  it 
replies,  so  all  of  the  requirements  for  inlining  have  been  satisfied.  Therefore,  the  optimizer 
inlines  average  inside  rangesum  to  produce  the  hcode  in  Figure  5-9,  which  is  optimized  to  the 
hcode  in  Figure  5-10  at  end  of  the  general  optimizations. 
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(LAMBDA  CST: :RANGESUM 

(#<Parameter  CST;:SELr  #<P-Ciass  CST ; : INTEGER> 

•<Parameter  CST;; HIGH  #<S-Ciass  CST; ;OBJECT>) 

{•<Parameter  CST; ;CONTINUATION  •<Cont-Type  •<S-Class  CST: ;OBJECT>>>) 

0 

(((LOCAL  435)  #<5-01358  CST; ;OBJECT>) 

((LOCAL  434)  icS-Class  CST; ;OBJECT>) 

((LOCAL  433)  #<S-Class  CST; ;OBJECT>) 

((LOCAL  432)  #<S-Class  CST: ;OBJECT>) 

((LOCAL  431)  #<S-Class  CST; ;0BJECT>) 

{(LOCAL  429)  l<P-Ciass  CST: ;BOOLEAN>) 

((LOCAL  CST:: HIGH)  «<S-Class  CST: ;OBJECT>) 

((L<r-AL  CST; :CONTINUATION)  #<P-Class  CST: ;CONTEXT>) 

((LOCAL  CST: ;CONTINUATTON)  i<Oisp-Type  #<S-Class  CST; :OBJECT>) 

((LOCAL  424)  f<P-Class  CST: :INTEGER>) 

((LOCAL  423)  #<P-Class  CST; : INTEGER>) 

{(LOCAL  CST::SELF)  •<P-Class  CST: ; INTEGER>) 

((LOCAL  CST;;B>  •<P-Class  CST; ; 1NTEGER>) 

((LOCAL  455)  #<S-caass  CST; ;OBJECT>) ) 

(ASSERT-TYPE  #<P-Class  CST ; ; INTEGER>  (LOCAL  435)) 

(APPLY  ((LOCAL  429))  (i<Bullt-In-Selector  CST: ;->  (LOCAL  435)  (LOCAL  CST: :HIGH) ) ) 

(IF  :FALSE  (LOCAL  429)  2979) 

(JUMP  2611) 

(LABEL  2979) 

(MOVE  (LOCAL  CST: :B)  (LOCAL  CST: :HIGH) ) 

(MOVE  (LOCAL  CST::SELF)  (LOCAL  435)) 

(TOUCH  (LOCAL  CST: :B) ) 

(TOUCH  (LOCAL  CST: :SELF) ) 

(ASSERT-TYPE  •<P-Class  CST ; ; INTEGER>  (LOCAL  CST::SELF)) 

(APPLY  ((LOCAL  423))  ( #<Bui 1 t-In-Sel ector  CS7::*>  (LOCAL  CST::SELF)  (LOCAL  CST::B))) 

(APPLY  ((LOCAL  424))  ( #<Bu 1 1 t- I n-Sel ector  CST:://>  (LOCAL  423)  #<Integer  2>) ) 

(MOVE  (LOCAL  455)  (LOCAL  424)) 

(MOVE  (LOCAL  431)  (LOCAL  455)) 

(APPLY  ((LOCAL  433))  (#<Buiit-In-Selector  CST;:-*>  (LOCAL  431)  #<Inceger  !>)) 

(APPLY  ((LOCAL  434))  ((LAMBDA  CST : : RANGESUM)  (LOCAL  433)  (LOCAL  CST::H1GK))) 

(APPLY  ((LOCAL  432))  ((LAMBDA  CST: : RANGESUM)  (LOCAL  435)  (LOCAL  431))} 

(APPLY  ((LOCAL  435))  ( i<Bul  1 1- 1 n-Seiector  CST::*»>  (LOCAL  432)  (LOCAL  434  ))) 

(LABEL  2611) 

(MOVE  (CONT-REF  (LOCAL  CST: : CONTINUATION)  (LOCAL  CST: :CONTINUATICN) )  (LOCAL  435))) 

Figure  5-9.  Rangesum  with  Average  Inlined 

The  integer  average  method  has  just  been  inlined  into  rangesum. 


(LAMBDA  CST: : RANGESUM 

(#<Parameter  CST::SELF  #<P-Class  CST : ; INTEGER> 

#<Parameter  CST::HIGH  l<S-Class  CST : : OBJECT>) 

(#<ParaTneter  CST:  :CONTINUATION  #<Cont-Type  l<S-Class  CST:  :OBJECT»>) 

0 

(((LOCAL  435)  •<S-Class  CSl . : OBJECT>) 

((LOCAL  434)  #<S-Class  CST: :08JECT>) 

((LOCAL  433)  •<P-Class  CST: : INTEGER>) 

((LOCAL  432)  #<S-Class  CST: :O0JECT>) 

((LOCAL  429)  #<P-Class  CST: ; BOOLEAN>) 

((LOCAL  CST::HIGH)  #<S-Class  CST: :OBJECT>) 

((LOCAL  CST: jCONTINUATION)  •<P-Class  CST: :CONTEXT>) 

(  (LOCAL  CST:  :C0NT1NUAT10N)  •<Dlsp-Type  #<S-CUss  CST:  :OBJECT>) 

((LOCAL  424)  »<P-Class  CST: : :NTEGER>) 

((LOCAL  423)  •<P-Class  CST ; ; I NTEGER? ) ) 

(ASSERT-TYPE  #<P-c:ass  CST: : INTEGER>  (LOCAL  435)) 

(APPLY  ((LOCAL  429))  ( #<Bu i It- In-Sel ector  CST; ; •>  (LOCAL  435)  (LOCAL  CST :: HIGH) ) ) 

(IF  :FALSE  (LOCAL  429)  2541) 

(JUMP  2611) 

(LABEL  2541) 

(ASSERT-TYPE  #<P-Class  CST : : INTEGER>  (LOCAL  435)) 

(APPLY  ((LOCAL  423))  ( #<601 1 1- 1  n-Se  1  ector  CST:  :•»  >  (LOCAL  435)  (LOCAL  CST ::  HIGH)  )  ) 

(APPLY  ((LOCAL  424))  ( f<Bu  1 J t - In-Sel ector  CST:://>  (LOCAL  423)  ICnteger  2>)) 

(APPLY  ((LOCAL  433))  ( #<Bui 1 t- 1 n-Se 1 ector  CST:;+>  (LOCAL  424)  •<Integer  1>)) 

(APPLY  ((LOCAL  434))  ((LAMBDA  CST RANGESUM)  (LOCAL  433)  (LOCAL  CST;  .-HIGH))) 

(APPLY  ((LOCAL  432))  ((LAMBDA  CST :: RANGESUM)  (LOCAL  435)  (LOCAL  424))) 

(APPLY  ((LOCAL  435))  ( l<Bu i It - 1 n-Sel ector  CST::*>  (LOCAL  432)  (LOCAL  434))) 

(LABEL  2611) 

(MOVE  (CONT-REF  (LOCAL  CST  ::  CONTI  NUATI  ON)  (LOCAL  CST  :  :  CONT I  N'JAT  I  ON)  )  (LOCAL  435))) 

Figure  5-10.  Rangesum  after  General  Optimizations 

The  rangesum  hcode  is  now  at  the  “Optimized  Hcode'  stage  in  Figure  3-4 


The  MDP-specific  optimizations  remove  the  assert-type  hcode,  reduce  the  division  to  a  shift, 
and  insert  enter  and  exit  hcodes  to  yield  the  final  hcode  in  Figure  5-11. 
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Sample  Program 


{LAMBDA  CST: :RANGESUM 

(•<Pardmeter  CST::SELF  #<P-Cldss  CST: :INTEGER> 
#<Parameter  CST::HIGH  •<S-Class  CST: :OBJECT>) 


(#<Parameter  CST: :CONTINUATION  #<Cont-Type  •<S-Class  CST: :OBJECT>») 

0 

({(LOCAL  435)  #<S“Class  CST: :OBJECT>) 

({LOCAL  434)  KS-CliSS  CS-^ : :  OBJECT>) 

((LOCAL  433)  #<P-Class  CS’.  : :  INTEGER> ) 

((LOCAL  432)  f<S-Class  CS7 : : OBJECT>) 

((LOCAL  429)  •<P-Class  CST : : BOOL£AN>) 

((LOCAL  CST::HIGH)  •<S-Class  CST : : OBJECT>) 

((LOCAL  CST: :CONTINUATION)  •<P-Class  CST: :CONTEXT>) 

((LOCAL  CST: ;C0NTINUAT10N)  #<Dlsp-Type  CST: :OBJECT>) 

((LOCAL  424)  #<P-Class  CST : : INTEGER>) 

((LOCAL  423)  #<P-Cla3S  CST : : INTEGER>) ) 

(ENTER) 

(APPLY  ((LOCAL  429))  (• <Bui It -I n-Seiect or  CST: 

(IF  :FALSE  (LOCAL  429)  2547) 

(JUMP  2611) 

(LABEL  2547) 

(APPLY  ( (LOCAL  423) ) 

(APPLY  ( (LOCAL  424) ) 

(APPLY  ( (LOCAL  433) ) 

(APPLY  ( (LOCAL  434) ) 

(APPLY  ((LOCAL  432)) 

(APPLY  ((LOCAL  435)) 

(LABEL  2611} 

(MOVE  (CONT-REF  (LOCAL  CST: :CONTiNUATION) 

(EXIT) ) 


(LOCAL  435)  (LOCAL  CST::HIGH))) 


(l<Built-In-Selector  CST: 

(#<Bui it-In-Selector  CST: 
(i<Bullt-In-Selector  CST: 

((LAMBDA  CST: : RANGESUM)  (LOCAL  433) 
((LAMBDA  CST: : RANGESUM)  (LOCAL  435) 
(•<BulU-In-Selector  CST::+>  (LOCAL 


■♦>  (LOCAL  435)  (LOCAL  CST::HIGH))) 
ASH>  (LOCAL  423)  •<lnteger  -!>)) 

+>  (LOCAL  424)  icinteger  1>)) 
(LOCAL  CST::HIGH) ) ) 
(LOCAL  424) ) ) 

432}  (LOCAL  434)  )  ) 


(LOCAL  CST: :CONTINUATION) )  (LOCAL  435)) 


Figure  5-11.  Final  Hcode 

This  is  the  final  hcode  produced  before  it  is  compiled  into  MDP  assembly  language 


Compilation  Phase 

The  compilation  phase  compiles  each  hcode  in  Figure  5-11  into  MDP  assembly  instructions 
and  then  peephole-optimizes  and  emits  the  resulting  code  to  produce  the  MDPSim  file  in 
Figure  5-12.  There  is  no  need  to  describe  the  transformations  here,  as  an  appropriate  exam¬ 
ple  is  in  [21], 

The  definitions  of  the  label  numbers  in  Figure  5-12  contain  expressions  of  the  form  LABEL 
cOb  ject=  (5SmX)  «sX  I  (5&my)<<sy|  (5&mZ)«sZl  (5im3)<<s3l  (5&m4)«s4|  (5&m5)« 
s5.  This  expression  means  that  cOb  ject  is  class  with  serial  number  5.  Nevertheless,  since 
objects  should  be  distributed  throughout  the  J-Machine,  the  bits  in  the  class  serial  number  5 
are  permuted  to  map  the  low-order  bits  onto  the  bits  denoting  the  x,  y,  and  z  network  coordi¬ 
nates  of  an  object.  This  is  done  by  the  first  half  of  the  expression, 
(5&mX)  «sX  I  (5&my)  «sy  I  ( 5&mZ)  <<sZ;  mX,  my,  mZ,  sX,  sy,  and  sZ  are  constants  defined 
by  the  operating  system  and  depend  on  the  dimensions  of  the  J-Machine.  The  second  half  of 
the  expression,  (5&m3)<<s31  (5&m4)«s4|  (5&m5)  <<s  5,  maps  the  rest  of  the  class  serial 
number  bits  onto  the  remaining  bits.  A  similar  expression,  REF  REV  fSum=lD;  (- 
2&mX)«sX|  (-2&my)<<syi  (-2&mZ)«sZ|  (-2&mS)  «sS,  is  used  to  map  objects  onto  nodes. 

LABEL  COb  ject  -  (54mX)  <<sX  I  (54inY)  <<sy  I  Ibtml)  «sZ  i  (54ni3)<<s3l  154x4)  <<s4  I  (54x5)  <<s5 
LABEL  cClass-(84mX)<<sXI  (84mY)<<syi  (84mZ)  <<sZ|  (84rT!3)<<s3f  (e4iT’4)<<s4(  (84m5)<<s5 
LABEL  cSt.andard_Cl  ass-  (34mX)  <<sX  1  (34mY)  <<sY  I  (34fnZ)  <<sZl  (34m3)<<s3i  (34m4)<-:s4l  (34m5)<«:s5 
LABEL  cPrimi t i ve_Class- (24mX) <<sX i  (24mY) <<sY!(24mZ)<<sZl  (24m3)<<s3l  (24m4)<<s4l  (24m5)<<s5 
LABEL  cDistrlbuted_Class- (44mX)<<sXI  (44mY)<<sY|  (44niZ)<<sZl  (44r.3)<<s3l  (44m4)<<s4l  (4  4it»5)  <<:s5 
.  BEL  cSymbol- (74mX) <<sX I  (74rpY)<<sYl  (74mZ)<<sZl  (74m3)<<s3l  (74m4)<<s4|  (74m5)<<s5 
LABEL  cNul  1-  (64mX)  «sX  |  (64fnY)  <<sY  [  (64rr.2)  <<sZ  I  (64rr.3)  <<s3  I  (64nf4)  <cs4  I  (64m5)  <<s5 
LABEL  cFunct-  (174mX}  «sX  1  (174mY)  <<sY  i  (174niZ)  «sZ  i  174m3)  «s3  l  (174m4)  «s4  i  (174m5)  <<s5 
LABEL  cSe lector-  (94mX)  <<SX  |  (94my)  <<sY  !  (94rrZ)  «sZ  1  (94m3)  <<s3  {94m4}  <<s4  .  (94 mb)  <<s5 
LABEL  cMagnitude-  (184mX)  <<sX  l  (184mY)  <<sY:  (184mZ)  <<sZ  I  (184rr3)  <<:s3l  (184m4)<cs4i  (184m5)<<s5 
LABEL  cCharacter-  (104mX)  <<sX  t  {lC4mY)  «sY  l  (lC4rriZ)  <<sZ  i  (104Tr.3)  «s3  t  (104m4)  <<s4  I  (104m5)  <<s5 
LABEL  cN umber-  (1  94mX)  <<sX !  (1  94mY)  <<sy  I  (1 94mZ)  <<sZ  i  (1  94m3)  <<s3  i  (1  94m'* )  <<s4  i  (1  94m5)  <<s5 
LABEL  cReal- (204mX) <<sXl  (204mY)<<sY!  (?04mZ)<<s2i  (2C4m3)<<s3i  (204m4)<<s4i  {2C4m5)<<s5 
LABEL  cInteger-(n4mX)<<sXl  (n4mY)«sy:  (ll4mZ)<<sZi  (i;4m3)«s3!  (Il4m4)<<s4  ,  (n4m5)<<s5 
LABEL  cBoolean-  (124mX>  «sX  |  (:24mY)  «sY  I  ( :24mZ)  «sZ  i  (124fr3)  ^<s3  I  (124m4)  <<s4  i  (124m5)  <<s5 
LABEL  cFal  se-  (1 34mX)  <<sXi  (134mY)<<sYi  (i34mZ)<<sZ;  (134rr3)<<s3;  {134m4)<<s4'  (■4  34m5)<<s5 
LABEL  cTrue- (1  44mX)<<sX  I  {l44mY)<<syi  (1  44fr.Z)  <<sZ  I  (144m3)<<s3i  (144m4)<<s4!  (144m5)<<s5 
LABEL  cFloat-(154mX)<<sXI  (154mY)<<sYi  (154mZ)  «.sZ  (154m3)  <<s3i  (154iti4)<<s4i  (154m5)«sb 
LABEL  cFunct  ion- (1  64mX)  <<sX  I  (164mY)<<sYl  (164mZ)<<sZi  (1 64rr;3)<<s3  ;  (I64m4)<cs4l  {164m5)<<sb 
LABEL  c_Closure-  (2'i4mX)<<sXi  (2l4my)<<sY|  (2l4m2)<<sZ:  (?A4m3)<<s3!  (2l4m4)<<s4i  (2l4m5)<<s5 
LABEL  cContext - (224mX) <<sX I  (224my) <<sy ,  (224mZ) <<sZ  >  (224m3) <<s3 i  (224m4) <<s4 I  (224m5) <<s5 
LABEL  cDisplacement - (234mX) <<sX '  (234my)<<5Y  (234mZ)<<sZi  (234m3)<<s3i  (234m4)<<s4l  {234m5)<<s5 
LABEL  cCont Inuat ion-  (2  44mX) <<sX i  (24  4my)<<sY '  (244m2) <<sZ ‘  (2  44m3) <<53  I  (24  4m4 ) <<s4 I  (2 44m5 ) <<s5 
LABEL  cGlobal- (254mX)  <<sX  ;  (254nY)<<sY:  (254r.Z )  <<sZ  I  (254m3)<<s3i  (2b4m4)<<s4i  (254m5)<<s5 
LABEL  cDistob)- {264mX)  <<sX  I  (264my)<<syj  (264mZ)<<s2!  (264ir3)  <<s3  }  (264m4)<<s4j  (264m5)<<s5 
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RCF  REV  selPLUS-TAGO:  subSEL<<subtagN  i  (O&mX)  <<sX  i  iO&mY)  «sy  i  (O&mZ)  «sZ  I  (0fcm3)  <<s3  !  (04m4)  <<s4  I  (Oimb)  <<s5 
REP  REV  seXEQUAL-TAGO:subS£L«subtagN|  (l4mX)«sXl  (l4my)<<sy!  (l4mZ)«sZ|  (I4m3}<<s3i  (l4Tn4}«s4|  (l4mS}«sS 
REF  REV  selAsh-TAG0:sub5EL<<subiagN  !  (24mX)<<sX|  (24mY)«8Y|  {24mZ)<<5Z|  (24m3)«s3l  (24m4)<<s4|  (24m5}<<sS 
REF  REV  Rangesum-ID: {-l4mX) <<sX I  (-l4mY)<<sYi  (-l4m2)<<sZl  {-i4mS)<<sS 
REF  REV  fSum-ID:  (-2<mX)  «sX|  {-24mY)  «sY|  {-2imZ)  «sZ  \  (-24mS)  «sS 


MODULE  cObject 

DC  MSG:hdrCopyabXe  I  cStandard  CXass«of  f setN I  5 

DC  TAGO: subCLASS<<6ubtagN } cOBject 

DC  MSG:cObjecL«offsetNl2 

DC  1 

DC  TAGO;subCIASS«subtagN|  cObject 

END 

MODULE  cClass 

DC  MSG;hdrCopyable| cPrimlt lve_Class<<of f setN 1 6 

DC  TAGO:subCLASS<<subt.agNl  cClass 

PC  NIL 

DC  2 

DC  TAGO;subCLASS<<subtagNicClass 

DC  TAGO; subCLASS<<subtagN I cOb)ect 

END 

MODULE 
DC 
DC 
DC 
DC 
DC 
DC 
DC 
END 

...  MODULES  for  the  rest  of  the  classes  deleted  ... 


MODULE  selPLUS 

DC  MSG : hdrCopyaole icSOiecior<<of iseiN  3 

DC  iSelPLUS^ 

DC  0 

END 

MODULE  selEOUAL 

DC  MSG : hdf Copyable I cSe lector<<of  fset  Ni 3 

DC  IseiEQUALl 

DC  0 

END 

MODULE  seiAsh 

DC  MSG :hdrCopy able icSe*ector<<of IsetNi  3 

DC  IselAsh) 

DC  C 

END 


cStandard_CXass 

MSGihdrCopyable I cPrimit lve_Ciass<<of f secN I  7 
TAGO: subCLASS<<subtagN | cStandard^Class 
NIL 
3 

TAGO :  subCLASS<<subt  aqN  !  cSLar'.dard_Class 
TAGO: subCLASS<<subtagN{cClass 
TAGO :  subCLASS<<subi  aqN  i  cOb'Ject 


MODULE 

f Rangesum 

DC 

MSG:  hdr  Copy  able  i  cF  unct  ior.<<of  f 

DC 

{  f Range sum) 

DC 

6 

MOVE 

12,A3] ,R0 

3 

MOVE 

f2,A3j , R3 

3,b 

EQUAL 

R3,  (3, A3] , RI 

4 

BT 

Rl, "LOOl 

4.5 

ADD 

R3,  13,A3j,R: 

5 

ASH 

R1,-1,R3 

5.5 

ADD 

R3,  1, R2 

6 

MOVE 

R2,  RO 

6.5 

CALL 

Ob jectNode 

7 

DC 

MSG  :msgAppl  yFunct  ior.  6 

8 

SEND20 

r:  ,  RO 

9 

DC 

i f Rangesum ) 

1  0 

SEND20 

RO,  R2 

' 

SENDO 

[3,  A3t 

■’1  ^ 

MOVE 

6,  RO 

2 

SEND2E0 

I ,  a;  ; ,  RO 

12.5 

WTAG 

RC, 6, RC 

•3 

MOVE 

Rc,  :6,a:  ; 

13.5 

MOVE 

12,  A3/, RO 

:  4 

CALL 

Ob jectNode 

14.5 

DC 

MSG:rnsq  Apply  fund  ion;  t 

:  5 

SEND2C 

R:  ,  RO 

1  6 

DC 

) f Rangesum | 

*  7 

SENDO 

RO 

18 

SEND2C 

'  2,  A3 ’ , R3 

18.5 

MOVE 

7,  RO 

1  9 

SEND2E0 

: I  /  All , RC 

19.5 

WTAG 

RC,  6, RO 

20 

MOVE 

RC,  (7, a; } 

2C.5 

MOVE 

[ 7,  A1 1 , R2 

21 

ADD 

R2,  [6,Ai ; ,RI 

21.5 

MOVE 

Rl, (2,A3i 

22 

Chartf'r  5 


Sample  Program 


LOOi : 

MOVE 

[4,A31 , R2 

23 

SNIL 

R2,  ■'L002 

23.5 

DC 

MSGimsgRep'y 14 

24 

SEND20 

R2,  RO 

25 

SENDO 

R2 

25.5 

SENDO 

15, A3] 

26 

SENDEO 

[2,A3] 

26.5 

L002; 

SUSPEND 

END 

27 

MODULE 

/Sum 

DC 

MSG: hdrCopyable 1 cFunct ion<<of  f setN 

DC 

( fSum) 

DC 

5 

MOVE 

0,  RO 

3 

CALL 

Ob ject Node 

3.5 

DC 

MSG : ms gApply Function l 6 

4 

SEND20 

R1,R0 

5 

DC 

(fRangesuml 

6 

SENDO 

RO 

7 

SENDO 

0 

7.5 

SENDO 

:2,A31 

6 

SENDO 

(3, A3] 

8.5 

SENDEJ 

(4, A3] 

9 

SUSPEND 

END 

9.5 

DOWN LOAD 
DOWNLOAD 
DOWNLOAD 
DOWNLOAD 
DOWNLOAD 
DOWNLOAD 
DOWNLOAD 
DOWNLOAD 
DOWNLOAD 
DOWNLOAD 
DOWNLOAD 
DOWNLOAD 
DOWNLOAD 
DOWN  LOAD 
DOWNLOAD 
DOWNLOAD 
DOWNLOAD 
DOWNLOAD 
DOWNLOAD 
DOWNLOAD 
DOW:,  LOAD 
DOWNLOAD 
DOWNLOAD 
DOWNLOAD 
DOWNLOAD 

download 

DOWNLOAD 

DOWNLOAD 

DOWNLOAD 

DOWNLOAD 


cOb ject 
cClass 

cStandard_Class 

cPrimlt ive_Class 

cDi St ri bated  Class 

cSymbol 

cNul  1 

cfjnct 

cSelector 

cMaqnit  aae 

cCharactcr 

cNumber 

cReal 

cinteger 

cBool ean 

cFalse 

cTrue 

cPloat 

cFunction 

c_Ciosure 

cContext 

cDi splacement 

cCont inaat ior 

colobal 

cDistob j 

selPLUS 

selEOL'AL 

sel Ash 

f Rangesam 

f  Sum 


RUN 

Figure  5-12  MDPSim  Output  File 

Except  for  Cosmos,  this  file  contains  all  code  and  data  necessary  to  run  sum  on  a  J-Machine.  The  file  starts  with 
dass  number  definitions,  which  are  followed  b>  definitions  of  the  dasses  themselves,  including  the  class  hierarchy. 
Thu  selectors  are  defined  next,  followed  by  code  and  MDPSim  statements  that  download  ail  of  the  cocfe,  selector, 
and  class  modules  to  the  simulated  J-Machine.  T'.'^  RUN  command  runs  the  J-Machine  until  all  modules  have 
been  loaded. 

Only  the  functions  and  selectors  necessary  to  run  the  program  have  been  compiled  For  example,  neither  average 
method  has  be'  .  induded  because,  after  optimization,  neither  is  necessary  to  run  sum.  Similarly,  all  method  dis¬ 
patches  have  been  optimized  out.  so  there  is  no  need  to  include  the  definition  of  the  rangesum  selector. 


Running  Rangesum 

Before  rangesum  can  be  run  on  MDPSim,  a  file  holding  the  calls  that  will  be  done  needs  to  be 
defined;  the  file  that  was  used  is  shown  in  Figure  5-13.  Each  MESSAGE  directive  defines  an 
ApplyFunction  message  that  can  be  used  to  call  the  sum  function.  The  argument  is  the 
third  word  of  the  message,  while  the  fourth  and  fifth  v  ;,rds  contain  a  magic  continuation  that 
cause  the  Reply  message  to  be  printed  by  MDPSim  in  the  listener  window.  The  MESSAGE 
definitions  can  also  be  entered  into  MDPSim  manually. 
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Once  the  calls  file  is  written,  MDPSim  can  be  started  and  used  to  run  sum  on  a  sample  input. 
An  example  session  is  shown  in  Figure  5-14,  in  which  the  input  10  is  tried  on  sum,  and  the 
statistics  observed.  The  results  will  be  discussed  in  more  detail  in  Chapter  7. 

MESSAGE  suml 
MSG:msgApplyFunction I  5 
{ fSum} 

1 

lONODE 

0 

END 

MESSAGE  sumlO 
MSG.’msgApplyFunction  I  5 
{ fSum) 

10 

lONODE 

0 

END 

MESSAGE  sumSO 
MSGimsgApplyFunction I  5 
{ fSum) 

50 

lONODE 

0 

END 

Figure  5-13.  R^^ngesum  Call  File 

Three  messages  have  been  defined  for  calling  the  sum  function  with  the  arguments  1,  10,  and  50.  ionode  is  an 
integer  constant  predefined  by  MDPSim  and  denotes  the  address  of  the  MOP  serving  as  the  I/O  node  between  the 
J-Machine  and  the  outside  world.  In  MDPSim,  the  I/O  node  simply  prints  every  message  it  receives. 

ICDPSia  2  •y  2  0x1000  :  :Co«aos:CoMo«.a  SUng«Sua.BKip  IUng»SuB. calls 

Message-Driven  Processor  Simulator 
Version  7.0  Rev  0 

Accompanies  MDP  Ar'.*hUecture  Document  IIB 
Written  by  Waldemar  Horwat 

Architecture  Updates  by  Brian  Totty  and  Jerry  Larivee 
UROPs  for  Bill  Dally 

4  MDPs  present. 

e0..3)watcb  fault  all 

90. . 3 1raaatatata 

90.. 3)inj«ct  auAlOSl 

90. . 3 1run 


Fault 

t 

\ 

(faultXlateC) 

Fault 

f 

{BBBW}  $008B  - 

DC 

{ ItXLATL 

;XLATE 

Fault 

9 

{looKupBi ndlng) 

Fault 

9 

IBBBW)  SO0C6  • 

DC 

f ItLookupBinding 

;  $0  6 

Fault 

9 

(enterBlndlng) 

Fault 

9 

iBBBWj  SOOC5  - 

DC 

f it EnterBi ndl ng 

;S0  5 

Fault 

s 

2 

(blockSend) 

Fault 

9 

2 

jBBBWI  S00C2  • 

DC 

i ItBlockSend 

;  $0  2 

Fault 

0 

2 

(faultLlmitO) 

Fault 

9 

2 

IBBBWI  $0088  • 

DC 

JitLimit 

; e:m:t 

Fault 

9 

(a  1  locOb ject ) 

Fault 

9 

IBBBWI  $C0C4  - 

DC 

f It A1 .ocCbiect 

;  SC4 

Fault 

9 

( lookupB; nd ; rg) 

Fault 

9 

IBBBWI  $00C6  - 

DC 

f It  LookupBi nd i ng 

;  S06 

Fault 

9 

(blockMove) 

Fault 

9 

IBBBWI  SCOCl  - 

DC 

f 1 1 B. ockMove 

;  SO  1 

Fault 

9 

(fauitl.imItC) 

9 

iBBBWi  $0088  - 

DC 

^  >  r  *  i  > 

c :  M :  T 

Fault 

9 

(ob  )ec: Node ) 

Fault 

9 

iBBbW-  $00D3  - 

DC 

f  1 1  Ob  ;;eci  Noue 

;  s:  3 

Fault 

9 

2 

f  faui tXlat  eC) 

Fault 

9 

2 

IBBBW'  5CC8B  - 

DC 

f ItXLATl 

; XLATE 

Fault 

9 

/ 

1 1 ook  upB: riding) 

Fault 

9 

2 

iBBBWf  SC0C6  •• 

DC 

f  .t  Dcukv.pBi  nalng 

;  SC  6 

Fault 

9 

? 

(enterBi  nd  1  r.g) 

Fault 

9 

2 

IBBBW'  $00CS  - 

DC 

fill  riterB i nd 1 ng 

;  SC  5 

Fault 

0 

(bl  ockSer.a) 

Fault 

9 

3 

IBBBW-  SCCC2  - 

DC 

! : t  Bi cckSend 

;SC2 

Fault 

9 

3 

(fau.iLlmiC) 
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Sample  Program 


fault : 
Fault 
Fault 
Fault 
Fault : 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Faul  t 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Faul  t 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Fault 
Faul  t 
Fa  u  1 1 
Fault 
Fault 
Fault 
Fault 
Fault 
Faul  t 
Faul  t 
Fault 
Fault 
Faul  t 
Fault 
Fault 


e 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

9 

0 

9 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 

0 


3:  (BBBWj  $0088  - 
2:  (allocOb^ect) 

2:  (BBBWl  $00C4  - 
2:.  (lookupBlndi ng) 

2:  (BBBWf  $00C6  - 
2:  (blockMove) 

2;  iBBBW)  .r 

2:  (faulCLimitO) 

2:  (BBBW)  $0088  * 

2:  (faultXlateO) 

2:  {BBBW)  $008B  - 
2:  (lookupBindlng) 

2:  (BBBW)  $CjC6  - 
2:  (faultXlateO) 

2:  (BBBW)  $008B  > 

2:  (objectNode) 

2:  (BBBW)  $0003  - 
1:  (faultXlateO) 

1:  IBBBW)  $008B  - 
2:  (ObjectNode) 

2:  (BBBW)  $0003  - 
1 :  (lookupBindlng) 

1:  (BBBW)  $O0C6  - 
0:  (faultXlateO) 

0:  (BBBW)  $0086  - 
2:  (faultCFutO) 

2:  (BBBW)  $0080  - 
0:  (lookupBindlng) 

0:  (BBBW)  $00C6  - 
2:  (faultXlateO) 

2:  (BBBW)  SOOeB  ‘ 

1 :  (enterBinding) 

1:  (BBBW)  $00C5  « 

3:  (blockSeno) 

3:  (BBBW)  $00C2  = 

2:  (lookupBindlng) 

2:  I BBBW)  $C0C6  - 
0:  (enterBinding) 

0;  (BBBW)  $Q0C6  - 
3:  (faultLlmitO) 

3:  (BBBW)  $0088  • 

3:  (blockSend) 

3:  IBBBWI  S00C2  * 

1:  (allocObject) 

1:  (BBBW)  S00C4  - 
3;  (faultLirritO) 

3:  (BBBW)  $0068  - 
0:  (allocObject) 

C:  (BBBW)  $00C4  • 

1:  (lookupBindlng) 

1:  IBBBW)  $00C6  • 

0:  (iookupSindl ng) 

0:  (BBBW)  S00C6  - 
1:  (blockMove) 

1:  (BBBW)  SOOCl  * 

0:  (blockMove) 

C:  IBBBW)  SOOCl  • 

1:  (faultLinitO) 

1;  (BBBW)  $0088  * 

0:  (fauitLimitO) 

0:  (BBBW)  $0088  ' 

1:  (faultXlateO) 

1:  (BBBW)  $008B  * 

0:  (objectNoae) 

0:  (BBBW)  $0003  * 

1:  (objectNode) 

1:  (BBBW)  S00D3  • 

C:  (objectNode) 

0:  (BBBW)  $0003  - 
1;  (objectNode) 

1:  (BBBW)  S00D3  - 
3:  (objectNode) 

3:  (BBBW)  $0003  • 

2:  (faultXlateO) 

2:  (BBBW)  $0080  * 

0:  (faultCFutO) 

0:  (BBBW)  $0080  = 

1:  (faultCFutO) 

1:  (BBBW)  $0080  - 
3:  (objectNode) 

3:  (BBBW)  $0003  - 
2:  (lookupBindlng) 

2:  IBBBW!  S0CC6  - 
3:  (faultCFutO) 

3:  ( BBBW)  SC08D  - 
0:  (objectNode) 

0;  IBBBW)  S00D3  * 

1 ;  (object  Node) 

1:  (BBBW)  $0003  - 
0 :  (ob 1ect  Node) 

C;  (BBBW)  SCC03  ' 

1 :  (objectNode) 


DC 

ficLimit 

/LIMIT 

DC 

fitAliocObject 

;S04 

DC 

fl tLookupBindi ng 

;$06 

DC 

f ItBlockMove 

;S01 

DC 

fitLlmlt 

/LIMIT 

DC 

fltXLATE 

;XLATE 

DC 

fl tLookupBindi ng 

/$06 

DC 

fltXLATE 

/XLATE 

DC 

fitObjectNode 

/Si3 

DC 

fltXLATE 

/XLATE 

DC 

fitObjectNode 

;$13 

DC 

f ItLookupBindi ng 

;S06 

DC 

fltXLATE 

/XLATE 

DC 

fltCFUT 

/CFUT 

DC 

f ItLookupBindi ng 

/S06 

DC 

fltXLATE 

/XLATE 

DC 

fitEnterBinding 

;SC5 

DC 

f 1 1 BIock  Send 

;S02 

DC 

fltLookjpBinding 

/SC6 

DC 

fitEnterBinding 

/S05 

DC 

fULimit 

/LIMIT 

DC 

f ItBlockSend 

/S02 

DC 

fitAliocObject 

/S04 

DC 

fltLlmit 

/LIMIT 

DC 

fitAliocObject 

/S04 

DC 

f ItLookupBlnding 

/S06 

DC 

f ItLookupBinding 

/S06 

DC 

f ItBlockMove 

/SOI 

DC 

f  1 1  BlocicMove 

/SOI 

DC 

fULimit 

/LIMIT 

DC 

fULimit 

/LIMIT 

DC 

fltXLATE 

/XLATE 

DC 

f ItOb jeclNode 

/s:3 

DC 

fitObjectNode 

/SI  3 

DC 

f ItOc jectNode 

/SI  3 

DC 

f 1 tOb jectNode 

/S13 

DC 

fitObjectNode 

/S13 

DC 

fltXLATE 

/XLATE 

DC 

fUCFOT 

/CFUT 

DC 

fltCFUT 

/CFUT 

DC 

fitObjectNode 

;s:3 

DC 

f ,  t  LooKupB  indi  r.g 

;  S06 

DC 

fUCFUT 

/CFUT 

DC 

f 1 tOb jectNode 

/s:3 

DC 

f  UCbjectNode 

;  s :  3 

DC 

f 1 lObjectNode 

;S13 
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Fault 

9  1: 

{BBBWf  $0003  « 

DC 

f itOb jectNode 

;S13 

Fault 

9  2: 

(Ob jectNode) 

Fault 

9  2: 

(BBBW)  $0003  - 

DC 

f ItObjectNode 

;$13 

Fault 

9  0: 

(faultCFutO) 

Fault 

9  0: 

(BBBWi  $0080  - 

DC 

fltCFUT 

;CFUT 

Fault 

9  1: 

(faultCFutO) 

Fault 

9  1: 

(BBBW)  $0080  - 

DC 

fltCFUT 

;CFUT 

Fault 

9  3: 

(object Node) 

Fault 

9  3: 

(BBBW)  $0003  • 

DC 

f ItObjectNode 

;S13 

Fault 

9  2; 

(Ob jectNode) 

Fault 

9  2; 

(BBBW)  $0003  - 

DC 

f ItObjectNode 

;S13 

Fault 

9  3; 

(Ob jectNode) 

Fault 

9  3: 

(BBBW)  $0003  • 

DC 

f itOb ject Node 

;S13 

Fault 

9  2: 

(faultCFutO) 

Fault 

9  2; 

(BBBW)  $0080  • 

DC 

fltCFUT 

;CFUT 

Fault 

9  3: 

(faultCFutO) 

Fault 

9  3: 

(BBBW)  $0080  - 

DC 

fltCFUT 

;CFUT 

Fault 

9  0: 

(ob jectNode) 

Fault 

9  0: 

(BBBW)  $0003  - 

DC 

f ItOb ject Node 

;S13 

Fault 

9  0: 

(ob jectNode) 

Fault 

9  0: 

(BBBW)  $0003  • 

DC 

f ItObjectNode 

;S13 

Fault 

9  1: 

( faultXlateO) 

Fault 

9  1: 

(BBBW)  $0C8B  - 

DC 

fitXLATE 

XLATE 

Fault 

9  0: 

(faultCFutO) 

Fault 

9  0: 

(BBBW)  $0080  « 

DC 

fltCFUT 

cruT 

Fault 

9  2: 

(ob jectNode) 

Fault 

9  2: 

(BBBW)  $0003  • 

DC 

f ItObjectNode 

;S13 

Fault 

9  2: 

(ob jectNode) 

Fault 

9  2: 

(BBBW)  $0003  - 

DC 

f ItObjectNode 

;$13 

Fault 

9  2: 

(faui tCFutO) 

Fault 

9  2: 

(BBBW}  $0080  » 

DC 

fltCFUT 

;CFUT 

Fault 

9  2: 

(faultXlateO) 

Fault 

9  2; 

(BBBW)  $0C8B  - 

DC 

fitXLATE 

; XLATE 

Fault 

9  2: 

(faultXlateO) 

Fault 

9  2: 

(BBBW)  S008B  « 

DC 

fitXLATE 

; XLATE 

Fault 

9  1: 

( faultXlateO) 

Fault 

9  1: 

(Baaw)  $0088  - 

DC 

fitXLATE 

; XLATE 

Fault 

9  2: 

( faultXlateO) 

Fault 

9  2: 

(BBBW)  S008B  - 

DC 

fitXLATE 

; XLATE 

Tick  1543  Received  priority  0  message: 

OBJ;S0O1D90O4  u-1  f-0  o f f set *5007 66-Reply  length-SC004 
INT:$OOOOFCOO  •  64512 
INT:$00000000  -  C 
INT:S00000037  •  55 

90, 

1544  ticks  executed. 


Dynamic  Instruction  Dsage; 


STOP 

2887 

47.13% 

READ 

737 

12.03% 

WRITE 

500 

8.16% 

READR 

163 

2.  66% 

SEND 

160 

2.61% 

DC 

143 

2.33% 

BR 

130 

2.12% 

XLATE 

123 

2.01% 

ROT 

117 

1 . 91% 

ADD 

104 

1.70% 

AND 

96 

1.60% 

WRITER 

86 

1.44% 

BT 

71 

1.16% 

SEN02 

70 

1.14% 

BNIL 

69 

1,13% 

NOP 

64 

1.04% 

BF 

57 

0.93% 

LDIP 

54 

0.88% 

SUB 

52 

0.65% 

SUSPEND 

50 

0.82% 

XOR 

48 

0.78% 

CALL 

48 

0.78% 

WTAG 

44 

0.72% 

LDIPR 

4C 

0.65% 

EQ 

37 

0.60% 

SENDE 

26 

0.42% 

EQUAL 

25 

0.41% 

SEND2E 

24 

0.39% 

CHECK 

22 

0.36% 

RTAG 

21 

C.34% 

OR 

1  b 

C.24% 

GT 

14 

C.23% 

ASH 

:o 

0.16% 

ENTER 

7 

0.11% 

GE 

4 

0.07% 

BNNIL 

4 

0.07% 

NEG 

0 

0.00% 

NOT 

0 

0.00% 

FFB 

0 

0.00% 

INVAL 

0 

0.00% 

PROBE 

0 

0.00% 

LSH 

0 

0.00% 

NEO 

0 

C.00% 

MUL 

0 

C.00% 
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MULH: 

0 

0.00% 

NEQUAL: 

0 

0.00% 

CARRY: 

0 

0.00% 

HALT; 

0 

0.00% 

BZ: 

0 

0.00% 

LE: 

0 

0.00% 

8NZ; 

0 

0.001 

LT: 

0 

0.00% 

STOP: 

2887 

47.13% 

Move; 

1468 

24.29% 

ALU: 

407 

6.64% 

Branch: 

331 

5.40% 

Network : 

330 

5.39% 

Field; 

204 

3.33% 

DC; 

143 

2.33% 

Fault : 

142 

2.32% 

Assoc : 

130 

2.12% 

NOP: 

64 

1.04% 

Other : 

0 

0.00% 

Foregnd : 

3239 

52.87% 

Total : 

6126 

Fault  Usage: 


object Node: 

21 

26.25% 

faultXiateO : 

14 

17.50% 

lookupBlnding: 

11 

13.75% 

faultCFutO: 

10 

12.50% 

fauItLimltO: 

6 

10.00% 

blockSend : 

4 

5.00% 

allocOb ject : 

4 

5.00% 

enterBindlng: 

A 

5.00% 

blockMove; 

A 

5.00% 

Total : 

80 

The  xlate  hit  ratio  is  109  out  of  123  (  88.62%). 

376  words  sent  in  51  messages  on  priority  C, 

Average  message  size:  7.37. 

16.29  instructions/word  {8.61  foreground  Instructions/word) 

120.12  instructions/message  (63,51  foreground  Instructions/message) 
No  priority  1  words  sent. 


00. .3) 

Figure  5-14.  MDPSim  Transcript 

This  transcript  shows  a  MOPSim  session  in  which  the  user  loads  the  rangesum  assembly  code  and  calls  the  sum 
function  with  the  argument  10  on  a  2x2x1-node  J-Machine  with  COSMOS  using  only  internal  memory  (-msize 
0x1000).  Since  watching  faults  was  enabled.  MDPSim  prints  each  fault  encountered  at  each  MOP  as  it  is  run¬ 
ning.  The  fault  message  gives  the  number  of  the  MOP  on  which  the  fault  occurred,  the  number  of  the  fault  vector, 
and  the  name  of  the  fault;  the  {BBBW}  is  additional  MDPSim  breakpoint  and  watchpoint  information.  Finally,  after 
1544  steps  the  answer  55  is  produced  and  displayed. 

The  dynamic  instruction  statistics  for  the  run  are  also  shown.  About  half  ot  the  time  is  spent  distributing  the  func¬ 
tions  to  all  of  the  nodes;  the  second  time  sum  is  called  with  the  argument  10.  it  only  takes  893  ticks  to  produce  the 
answer  (a  tick  is  the  time  it  takes  every  node  to  execute  one  instruction;  MDPSim  assumes  that  every  instruction 
runs  in  the  same  amount  of  time). 


97 


Chapter  6.  Debugging 

Optimist  II,  Cosmos,  and  the  Concurrent  Smalltalk  application&  are  large  programs,  and  de¬ 
bugging  them  is  an  important  consideration.  I  will  not  discuss  the  process  of  debugging  Op¬ 
timist  II  itself;  standard  Common  Lisp  and  CLOS  techniques  such  as  building  firewalls  and 
providing  print  routines  for  important  data  structures  were  used. 

The  primary  approach  to  debugging  MDP  code  I  took  is  prevention.  I  made  sure  that  the 
Cosmos  design  was  sound  before  running  it.  The  criticality  criteria  were  very  helpful  in 
avoiding  re-entrancy  and  double  fault  problems.  Nevertheless,  while  the  prevention  ap¬ 
proach  was  successful  on  Cosmos  itself,  it  cannot  be  the  sole  debugging  method  used  on  the 
Concurrent  Smalltalk  programs.  Instead,  a  combination  of  debugging  means  at  various 
levels  has  been  provided. 

Debugging  Concurrent  Smalltalk  Code 

The  first  line  of  defense  is  the  Optimist  II  compiler  itself.  The  compiler  will  complain  when  it 
detects  errors  such  as  incorrect  function  argument  counts  or  bad  types,  if  types  are  declared. 

The  second  line  of  defense  is  the  interpreter  in  the  Optimist  II  compiler.  The  interpreter  can 
be  used  to  run  Concurrent  Smalltalk  programs  before  they  are  downloaded  into  MDPSim  or 
onto  a  J-Machine.  The  interpreter  provides  nearly  complete  checking  of  Concurrent 
Smalltalk  programs,  so  it  should  catch  most  of  the  remaining  bugs.  However,  the  interpreter 
will  not  catch  bugs  which  occur  only  on  large  data  sets,  nor  will  it  find  Cosmos’s  or  the  Opti¬ 
mist  II  code  generator’s  bugs. 

Debugging  MDP  Code  on  MDPSim 

Debugging  becomes  considerably  more  difficult  once  the  code  is  in  assembly  language  form. 
Fortunately,  Cosmos  does  include  some  facilities  for  debugging  Concurrent  Smalltalk  pro¬ 
grams. 

The  third  line  of  defense  is  comprised  of  the  safety  features  built  into  the  MDP  architecture. 
Type  and  bounds  checking  were  extremely  valuable  when  debugging  Cosmos,  as  they  catch 
most  common  type  errors  when  they  happen  and  prevent  runaway  programs  from  doing  too 
much  damage  to  the  machine  state.  Without  these  facilities  debugging  Cosmos  and  Concur¬ 
rent  Smalltalk  programs  could  have  been  intractable. 

'The  fourth  line  of  defense  consists  of  safety  checks  built  into  a  number  of  critical  places  in 
Cosmos.  These  checks  include; 

•  A  check  in  the  CFUT  handler  that  distinguishes  real  cfutures  from  uninitialized  vari¬ 
ables,  together  with  the  initialization  of  memory  and  globals  to  values  that  will  cause  CFUT 
faults. 

•  Checks  in  the  XLATE  and  INVADR  handlers  for  references  to  primitive,  nonexistent,  or 
deleted  objects.  Without  these  checks,  such  references  would  generate  messages  that  wander 
about  the  J-Machine  forever. 

•  A  check  in  the  Return  handler  to  make  sure  that  the  context  was  expecting  the  value 
that  was  returned.  This  check  catches  the  extremely  elusive  bug  of  replying  to  the  same  con¬ 
tinuation  twice,  as  the  second  reply  message  may  overwrite  a  variable  in  the  context  after  it 
has  been  reallocated  to  a  completely  unrelated  function.  The  bug  will  be  caught  eventually, 
even  if  the  second  function  stores  a  cfuture  into  the  same  context  location,  because  then  there 
will  still  be  two  replies  to  the  same  context  location,  and  the  cycle  will  repeat  itself  Of 
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course,  by  the  time  the  bug  will  be  caught,  the  original  evidence  may  be  gone,  but  at  least 
there  will  be  some  indication  of  a  problem. 

•  A  check  in  the  Co  routine  for  a  reference  to  a  nonexistent  constituent  of  a  distributed  ob¬ 
ject. 

•  A  HALT  on  any  reference  out  of  bounds  of  any  object  except  in  BlockMove  and  Block- 
Send. 

•  HALT  instructions  on  any  type  or  overflow  faults  that  occur  in  the  course  of  execution  of 
Concurrent  Smalltalk  programs. 

Furthermore,  MDPSim  does  its  part  to  make  debugging  easier.  Once  the  operating  system  is 
loaded,  memory  used  by  the  operating  system  code  is  read  and  write-protected  (it  may  only 
be  executed)  to  catch  any  runaway  references  to  it.  Since  dereferencing  nil  is  a  common 
mistake  in  the  MDP’s  unchecked  mode,  physical  memory  locations  0  through  3  have  been 
protected  from  all  accesses  to  catch  any  routines  that  dereference  nonexistent  objects.  More¬ 
over,  MDPSim  immediately  halts  if  a  message  is  sent  to  a  nonexistent  node. 

MDPSim  includes  the  halt  instruction  which  is  not  present  on  the  MDP.  The  HALT  instruc¬ 
tion  immediately  halts  the  simulated  J-Machine  without  altering  any  state.  However,  the 
HALT  instruction  can  almost  be  emulated  on  the  J-Machine — executing  halt  will  cause  ei¬ 
ther  an  INVINST  or  a  catastrophe  fault,  which  can  be  intercepted. 

Moreover,  the  newest  MDPSim  [25]  includes  hazard  detection — MDPSim  7.0  will  complain 
and  optionally  stop  the  program  if  it  detects  an  unsafe  programming  construct  such  as  refer¬ 
encing  the  FIR  register  if  it  could  have  been  altered  by  an  asynchronous  interrupt  or  sending 
a  message  when  the  F  bit  is  set  (a  network  send  fault  could  be  catastrophic  in  this  case). 
Clearly  MDPSim  cannot  discover  all  such  possible  bugs,  but  it  can  provide  considerable  as¬ 
sistance  in  uncovering  sporadic  asynchronous  bugs. 

Finally,  MDPSim  is  deterministic — running  the  same  program  twice  will  always  yield  identi¬ 
cal  results.  Thus,  if  an  inexplicable  bug  occurs,  it  can  always  be  reproduced.  Moreover,  ear¬ 
lier  snapshots  in  time  can  be  examined  by  running  the  same  session  again  in  MDPSim.  On 
the  Macintosh  version  of  MDPSim,  the  entire  session  is  automatically  saved,  making  repro¬ 
ducing  it  easy. 


Debugging  MDP  Code  on  a  J-Machine 

Debugging  code  on  a  real  J-Machine  is  still  harder  than  debugging  it  with  MDPSim.  Cosmos 
currently  does  not  include  any  facilities  specifically  designed  for  such  debugging  other  than 
the  ones  described  above,  but  such  facilities  are  being  added  in  the  true  J-Machine  version  of 
it.  The  primary  facilities  consist  of  a  set  of  mousetraps  to  catch  weird  conditions  such  as 
hardware  errors  and  a  set  of  fault  handlers  that  interact  with  the  host  through  the  diagnostic 
port.  Unfortunately,  it  is  impossible  to  examine  an  MDP’s  state  without  destroying  some 
register  values,  so  debugging  on  the  hardware  is  much  harder. 

Assuming  one  can  stop  the  computation  at  a  safe  point,  it  is  possible  to  get  a  dump  of  all 
memory  and  most  registers  on  each  MDP  in  a  J-Machine.  What  does  one  do  with  a  huge 
dump  of  the  state  of  a  J-Machine?  One  possible  course  of  action  would  be  to  examine  it  using 
MDPSim’s  debugging  facilities.  Another  possibility  is  periodically  checkpointing  the  compu¬ 
tation  on  the  J-Machine  by  saving  images.  If  a  crash  occurs,  earlier  images  can  be  examined 
or  restarted  to  determine  the  cause  of  the  crash. 

Summary 

Debugging  Concurrent  Smalltalk  code,  while  not  especially  easy,  is  not  impossible.  Several 
lines  of  defense  against  bugs  in  Concurrent  Smalltalk  programs  are  provided.  It  is  highly 
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recommended  to  try  to  find  bugs  in  the  earlier  steps  of  the  compilation  process  because  the 
tools  at  those  levels  are  more  robust  and  informative  (but  not  as  faithful  to  the  J-Machine). 

Although  Cosmos  includes  many  checks  for  the  common  Concurrent  Smalltalk  programming 
errors,  Cosmos  does  not  protect  itself  from  itself — it  does  not  detect  corruption  in  its  data 
structures.  Fortunately,  segmentation  by  the  MDP  ensures  that  those  data  structures  could 
only  be  corrupted  by  Cosmos  itself,  as  well -compiled  Concurrent  Smalltalk  programs  cannot 
reference  data  outside  their  segments.  Cosmos  was  mainly  debugged  by  design,  with  only 
minor  debugging  necessary  once  the  operating  system  was  written. 

MDPSim  also  helps  in  debugging  MDP  code  by  providing  watchpoints,  breakpoints,  the  halt 
instruction,  hazard  detection,  and  determinism,  which  allows  any  bug  to  be  reproduced. 
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Both  Cosmos  and  the  code  output  by  Optimist  II  were  optimized  for  speed.  This  chapter  pre¬ 
sents  some  measurements  that  determine  just  how  fast  compiled  Concurrent  Smalltalk  runs 
on  a  J-Machine.  Both  theoretical  derivations  and  real  measurements  are  presented  and 
compared.  Both  calculations  indicate  that  the  average  grain  size  (the  ratio  of  useful  instruc¬ 
tions  executed  to  messages  sent)  for  running  Concurrent  Smalltalk  on  a  J-Machine  is  be¬ 
tween  50  and  70  instructions,  and  the  average  number  of  instructions  executed  per  method  is 
about  100  instructions.  This  is  a  pity  if  the  average  method  only  performs  a  few  instructions’ 
worth  of  real  computation,  yet,  since  Cosmos  and  the  code  output  by  Optimist  II  are  already 
heavily  optimized,  it  does  not  seem  likely  that  incremental  changes  will  reduce  these  num¬ 
bers  much  further. 

In  addition  to  the  above  figures,  various  other  statistics  are  presented.  The  static  and  dy¬ 
namic  instruction  use  frequencies  were  collected  to  identify  areas  in  which  the  MDFs  hard¬ 
ware  performance  could  be  improved;  no  major  surprises  were  found  there.  These  frequen¬ 
cies  indicate  that  the  MDP  spends  an  average  of  about  2  cycles  per  instruction;  this  number 
increases  to  4  if  slow  external  DRAM  is  used  to  hold  the  user  program  and  data. 

Finally,  the  network  load  is  analyzed.  The  network  should  not  become  saturated  until  more 
than  343  MDPs  are  put  together;  if  a  larger  J-Machine  is  to  be  built,  either  the  network  will 
have  to  be  made  faster,  the  operating  system  slower,  or  considerable  attention  will  have  to  be 
paid  to  locality. 
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7.1.  Derived  Times 

This  section  presents  some  rough  estimates  of  the  overhead  on  the  J-Machine.  A  number  of 
assumptions  are  made  when  making  these  estimates;  the  results  of  actual  measurements 
will  be  reported  in  the  next  section  to  verify  those  assumptions. 

Cosmos  Estimates 

The  instruction  counts  needed  for  various  important  Cosmos  services  are  shown  in  Table  7-1. 
The  counts  are  approximate,  but  usually  accurate  to  within  a  few  instructions.  The  counts 
listed  may  not  be  completely  correct  due  to  approximations  in  some  routines. 


Table  7>1.  Selected  Cosmos  Routine  Instruction  Counts 


Routine  I  Instruction  Count  i  Description 

Method  and  Control  Managers 

Apply 

3+ApplySelector  or 
5+ApplvFunction 

Dispatch  a  general  Apply  message. 

msmaHi 

4 

Dispatch  an  ApplyFunction  message. 

>23  (>15+LookupMethodU) 

Dispatch  an  ApplySelector  message. 

LookupMethod 

8+LookupMethodU 

Lookup  a  method  given  a  class  and  a 
selector. 

LookupMethodU 

8  on  cache  hit, 

40+SaveStatelD023+message  la¬ 
tency  on  cache  miss. 

Internal  core  of  LookupMethod. 

CFUT  Fault 

=30+2msize  if  context  available 
on  queue  (14+SaveStafelD023) 

Save  state  when  a  cfuture  was  read 
from  the  context. 

Reply 

27  if  process  is  restarted; 

12  if  not. 

Process  a  reply  message. 

RestariContext 

20 

Unconditionally  restart  a  context. 

I  Context  Manager  I 

SaveStatelD023 

14  if  message  already  saved  in 
context  and  new  context  avail¬ 
able  in  queue 

16+2msize  if  context  available  on 
queue; 

17+2msize+AllocNextObject 

otherwise. 

Save  the  ID  registers  and  the  mes¬ 
sage  in  the  context,  save  the  context, 
and  suspend. 

I  Global  Obiect  Manager  I 

NewObject 

=37+2msize+Reply 

(21+SaveStatelD023+Replv) 

Allocate  a  remote  object. 

ClassOf 

15  to  25  (lO+TypeOf) 

Return  the  class  of  an  object. 

Type  Of 

5  to  15,  depending  on  tag 
(5  for  integers,  11  for  ordinary 
user  obiects,  varies  for  others) 

Internal  core  of  ClassOf. 

ObjectNode 

9  for  primitive  objects; 

4  for  ordinary  user  objects; 

32  for  distributed  obiects. 

Return  the  node  most  likely  to  con¬ 
tain  the  object  or  a  random  node  if 
the  object  is  primitive. 

Co 

38  (49  when  the  object  has  more 
constituents  than  there  are 
nodes) 

Return  the  ID  of  the  nth  constituent 
of  a  distributed  object. 
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PreferredConst 

27  (12  when  the  object  has  more 
constituents  than  there  are 
nodes) 

Return  the  ID  of  a  nearby  con¬ 
stituent  of  a  distributed  object. 

MigrateObject 

=62-hAllocObject+LookupBinding-h 

2  size  (may  vary  if  more  or  fewer 
contexts  are  restarted) 

Receive  and  install  an  object  and 
restart  a  context  waiting  for  it. 

UpdateHome 

>  1 1-h  LookupBi  ndi  ng 

Update  a  migrated  object’s  home 
BRAT  entry. 

Unlock 

9 

Unlock  an  obiect. 

1  Local  Obiect  Manager  1 

NewLocalObject 

3-f-AllocNextObject 

Allocate  a  local  object  of  the  given 
class. 

AllocNext  Object 

12-r  AllocObject -h  EnterBinding 

Allocate  a  local  object  using  the  next 
ID  and  the  given  header  word. 

DeallocateObiect 

11-t- PurgeBinding 

Deallocate  a  local,  unlocked  object. 

1  BRAT  Manager  1 

EnterBinding 

26  (35  if  no  free  BRAT  entries 
were  available;  may  also  compact 
heap) 

Allocate  a  new  BRAT  binding. 

LookupBinding 

14-h5n,  where  n  is  the  number  of 
links  traversed  in  linked  list. 

Lookup  a  binding  in  the  BRAT. 

PurgeBinding 

2-p  DeleteBinding 

Delete  a  binding  from  the  BRAT  and 
the  XLATE  table. 

DeleteBinding 

23+5n,  where  n  is  the  number  of 
links  traversed  in  linked  list. 

Delete  a  binding  from  the  BRAT. 

!  Heap  Manager  1 

AllocObject 

20;  may  also  compact  heap 

Allocate  an  obiect  on  the  heap. 

CompactHeap 

varies  from  2N  to  ION  or  more, 
where  N  is  the  size  of  the  heap. 

Compact  the  heap. 

1  Utilities  1 

Divide 

from  40  for  small  numbers  to  400 
for  large  numbers. 

Divide  two  32-bit  numbers  and  re¬ 
turn  the  quotient  and  remainder. 

1  Faults  1 

Early  Fault 

8 

Penalty  for  reading  data  from  mes¬ 
sage  queue  too  fast. 

Send  Fault 

8 

Penalty  for  sending  data  into  net¬ 
work  too  fast. 

Some  Definitions: 

size  is  the  size  of  the  object. 

msize  is  the  size  of  the  message  in  the  queue,  if  the  message  has  already  been  saved  and  the  O  flag  is  false, 
msize  is  defined  to  be  -1  for  the  purposes  of  the  above  timings.  If  msize  is  mentioned  in  a  time  expression,  the 
current  process  is  suspended  and  later  restarted:  the  time  does  not  include  the  time  between  the  suspension  and 
the  resumption  because  other  processes  are  assumed  to  execute  then. 


User  Program  Estimates 

In  contrast  to  the  counts  in  Table  7-1,  an  examination  of  the  rangesum  method  in  Figure  5-12 
shows  that  it  takes  about  13  instructions'  to  execute  a  function  or  method  call  and  about  8 
instructions  to  return  a  reply  and  suspend  (see  Table  7-2),  Thus,  the  typical  time  the  MDPs 
spend  in  user  code  to  execute  a  function  call  and  return  is  about  21  instructions;  perhaps  a 
few  more  instructions  are  used  for  primitives,  but  the  user  code  execution  time  is  seldom 
more  than  30  instructions  per  function  invocation.  Hence,  estimating  conservatively,  any 


'There  are  several  NOPs  not  shown  in  the  listing  caused  by  alignment  around  DCs, 
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Table  7*2.  Selected  User  Action  Instruction  Counts 


Action 

Instruction  Count 

Description 

Function  or 

Method  Call 

«»ll+nargs.  May  be  higher  if  ar¬ 
guments  must  be  touched  or 
lower  if  many  SEND2s  are  used. 

Call  a  function  or  a  method.  The 
time  does  not  include  the  CFUT  fault 
or  replv  time. 

Reply  with 

Suspend 

8-10 

Return  a  reply  to  the  caller. 

1—4  for  instructions  and  up  to 

400  or  more  for  svstem  calls. 

Perform  a  primitive  operation  such 
as  an  addition  or  a  conditional. 

Nargs  is  the  number  of  arguments  sent  in  the  application  message. 


time  above  30  instructions  per  function  or  method  invocation  is  spent  in  the  operating  sys- 
tem^ 

Analysis 

A  juxtaposition  of  the  main  figures  from  Tables  7-1  and  7-2  reveals  that  a  typical  program 
will  spend  about  70%  of  its  active  time  in  the  operating  system  and  30%  of  the  time  in  user 
code.  Furthermore,  the  program  will  take  about  100  instructions  per  function  invoked,  ex¬ 
cept  for  tail-forwarded  functions  which  will  only  take  about  25  instructions  each.  About  20 
extra  instructions  should  be  added  for  each  method  dispatch  that  the  compiler  is  unable  to 
optimize  out.  To  derive  these  estimates  the  following  system  of  accounting  is  used:  the  work 
ascribed  to  a  function  invocation  consists  of  all  work  needed  to  call  the  function  on  the  origi¬ 
nating  node  plus  all  work  needed  to  dispatch  the  function  on  the  called  node,  but  not  includ¬ 
ing  the  work  done  by  the  called  function  to  call  other  functions. 

Standard  Invocations 

Each  non-tail-forwarded  function  invocation  requires  the  processing  of  an  Ob  jectNode  call, 
a  function  message  send,  a  reply  message,  and  optionally  a  cfuture  fault  on  the  originating 
node,  and  a  function  dispatch  and  a  reply  on  the  called  node.  Assuming  that  the  average 
function  call  has  two  arguments,  the  total  operating  system  work  for  the  above  activity  is: 

Ob jectNode  +  ApplyFunction 

+  c(CFUT  fault  +  restarting  Reply)  +  (l-c)(non-restarting  Reply) 

=  9  +  4  +  69c  +  12(l-c) 

=  25  +  57c  instructions. 

c  is  the  probability  that  a  cfuture  will  be  referenced  before  being  replaced  by  the  returned 
value.  TTiis  probability  can  vary  over  a  wide  range  depending  on  the  branching  factor  of  the 
program  call  graph,  c  is  1.0  for  a  recursive  factorial  program  and  0.5  for  a  recursive  fibonacci 
or  rangesum  program.  If  a  branching  factor  between  1  and  2  is  assumed,  c  will  be  somewhere 
between  0.5  and  1.0;  suppose  it  is  0,75,  which  results  in  68  instructions  executed  in  the  oper¬ 
ating  system  per  function  invocation. 

The  total  user  code  work  is 

Function  call  +  Primitives  called  by  function  +  Reply  with  Suspend. 

The  time  spent  executing  primitives  will  vary  greatly  depending  on  the  application;  10  in¬ 
struction  seems  reasonable  for  most  cases,  although  it  will  be  higher  if  the  user  program  calls 
Divide  or  allocates  objects.  Substituting  this  number  and  the  average  number  of  arguments 
yields  a  total  user  code  work  of 


*Tail-forwarded  calls  are  cheaper  because  the  net  cost  of  a  tail-forwarded  call  is  one  call  and  no  return,  which  is 
about  IS  user  code  instructions. 
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13  +  10  +  9  =  32  instructions. 

Thus,  the  total  amount  of  work  taken  to  process  one  function  invocation  is  100  instructions, 
out  of  which  about  10  instructions  (the  primitives)  could  be  construed  as  being  “useful"  work 
and  the  rest  overhead.  This  figure  does  not  include  any  object  migration  or  XLATE  miss 
overhead.  These  results  should  not  be  interpreted  as  implying  that  an  MDP  running  Cosmos 
has  a  performance  10  times  slower  than  a  comparable  processor  in  a  sequential  computer  be¬ 
cause  sequential  computers  also  have  a  considerable  function  calling  and  parameter  passing 
overhead. 

Tail-Forwarded  Invocations 

Tail-forwarded  applications  are  considerably  more  efficient.  Using  the  accounting  method 
outlined  above  results  in  ascribing 


Figure  7-1.  Function  Invocation  Latency 

The  latency  of  the  network  is  estimated  at  about  10  instruction  times  (20  cycles)  to  send  a  message  between  two 
randomly  chosen  nodes  on  a  4096-node  machine. 

If  n  is  the  time  taken  by  the  called  function,  the  latency  of  invoking  a  function  is  9+1 1+1O+4+0+1O+27+/7  =  79+n 
instructions  unless  the  called  function  takes  fewer  than  12  instructions,  in  which  case  the  latency  is  9+11+2+42+27 
=  91  instructions. 
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Ob  jectNode  +  ApplyFunction  =  13  instructions 

operating  system  overhead  and 

Function  call  +  Primitives  called  by  function  =  23  instructions 

user  code  work.  The  total  work  done  is  36  instructions,  out  of  which  again  10  instructions  is 
“useful"  work. 

Latency 

The  preceding  analysis  calculated  the  total  amount  of  work  needed  per  function  invocation  in 
a  program,  which  determines  throughput  on  a  fully  loaded  system  in  which  each  processor  is 
busy;  however,  another  important  component  of  performance  is  latency.  It  turns  out  that  the 
latency  of  a  function  Invocation  can  be  lower  than  the  amount  of  work  done  by  the  function 
invocation  because  two  processors  (the  caller  and  the  callee)  can  execute  much  of  the  function 
invocation  in  parallel. 

Assuming  no  other  activity  in  the  system,  a  non -tail-forwarded  function  invocation  will  con¬ 
sist  of  the  caller  sending  a  message  to  the  callee.  Then  the  callee  evaluates  the  function, 
while  the  caller  takes  a  efuture  fault  (or  calls  another  function,  but  this  won’t  matter).  Un¬ 
less  the  called  function  is  very  short,  the  ca'ler  will  finish  the  efuture  fault  processing  and 
then  idle  before  it  gets  the  reply  message  from  the  callee.  Finally,  the  callee  replies  to  the 
caller,  which  restarts  the  calling  process. 

As  can  be  seen  in  Figure  7-1,  the  latency  of  a  function  call  is  79  instructions  in  addition  to  the 
time  taken  to  execute  the  function;  if  the  function  takes  fewer  than  12  instructions  to  exe¬ 
cute,  the  overall  latency  is  91  instructions.  These  numbers  are  less  than  the  total  amount  of 
work  done  by  the  system  (104  instructions). 

Summary 

The  results  above  indicate  that  the  number  of  instructions  needed  to  process  a  function  invo¬ 
cation  for  Cosmos  running  on  a  J-Machine  should  be  about  100  instructions,  with  the  notable 
exception  of  tail-forwarded  functions,  which  require  only  about  36  instructions.  The  instruc¬ 
tion  counts  may  be  higher  if  many  primitive  calls  are  made  or  if  the  operating  system  faults 
often. 
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7.2.  Measurements 


Grain  Size  and  Machine  Load 

To  attempt  to  measure  the  J-Machine’s  performance  and  grain  size,  I  ran  several  programs, 
including  factorial  (Figure  7-2);  rangesum  as  listed  in  Chapter  5;  rangesum2  (Figure  7-3), 
which  is  a  version  of  rangesum  which  builds  and  traverses  a  data  structure;  and  sort  (Figure 
7-4),  which  generate  ?  and  sorts  an  array  of  r*  pseudo-random  numbers  using  the  Batcher 
parallel  sort  technique  described  on  page  112  of  [281. 

(defun  fact  (n) 

(if  (zero?  n) 

1 

(*  n  (fact  (-  n  1)  )  )  )  ) 

Figure  7-2.  Tactorial  Program 

(dafclass  pair  (object) 
car 
cdr ) 

(defun  cons  (x  y):pair 

(put-car-cdr  (new  pair)  x  y) ) 

(dcfmethod  put-car-cdr  pair  (x  y) :pair 
(cset  car  x) 

(cset  cdr  y) 
self) 

(defun  ma)ce-countlist  (low:  integer  high:  integer) 

(if  (>  low  high)  (halt)) 

(if  {«  low  high) 
low 

(let  ((middle  (//  (+  low  high)  2))) 

(cons  (ma)ce-countlist  low  middle) 

(ma)ce-countlist  (+  middle  1)  high))))) 

(def method  reduce  pair  (op:funct) 

(op  (reduce  car  op)  (reduce  cdr  op) ) ) 

(defmethod  reduce  integer  {op:funct) 
self) 

(Jefun  add  (x  y) 

(+  X  y)) 

(defun  reduc«-add  (tree) 

(reduce  tree  add)  ) 

(defmethod  ramp  integer  {) 

(ma)^e-countlist  0  self)) 

(defmethod  r.Tgeoum2  integer  () 

(reduce-add  (r=imp  self)  )  ) 

Figure  7-3.  Rangesum2  Program 

This  program  exercises  several  Concurrent  Smalltalk  object  facilities  such  as  allocating  objects  and  traversing 

trees 
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(defclass  discarray 
value) 


(di  stobj) 


(defmethod  initialize  distarray  {low, high: integer  fzfunct) 
(if  (*  low  high) 

(cset  (get-value  (co  group  low)'  (f  low)) 

(clet  ((middle  {//  (+  low  high)  2))) 

(concurrently 

(initialize  group  low  middle  f) 

(initialize  group  (•►  middle  1)  high  f)  )  )  )  ) 


(defun  ma)ce-di starray  (n  modulus) 

(clet  {(da  (new  distarray  n) ) ) 

(initialize  da  0  (-  n  1)  (lambda  (x)  (mod  (*  x  x  x)  modulus))) 
da)  ) 


(defmethod  sort-exchanges  distarray  ( low, high, p, r, d: integer ) 

(if  (<=  low  high) 

(if  (*  low  high) 

(clet  ((1qw2  (+  low  d)  )  ) 

(clet  (  (vl  (get-value  (co  group  low))) 

(v2  (get-value  (co  group  low2)))) 

(if  (>  vl  v2) 

(concurrently 

(cset  (get-value  (co  group  low))  v2) 

(cset  (get-value  (co  group  iow2))  vl))))) 

(clet  ((middle  (//  (+  low  high)  2))) 

(concurrent  1 y 

( sort-exc)ianges  group  low  middle  pro) 

(sort-exchanges  group  {+  middle  1)  high  p  r  d)))))) 

(defmethod  sort-q  distarray  (p, q, r, d: integer) 

(sort-exchanges  group  0  (-  ( 1  ogi ca  1 -1  i mi t  self)  (•»  d  1))  p  r  d) 

(i  f  (<>  p  q) 

(sort-q  group  p  (//  q  2)  p  {-  q  p) ) ) ) 

(defmethod  sort-p  distarray  (hal f, p: integer) 

(sort-q  group  p  half  0  p) 

(if  (>  p  1 ) 

(sort-p  group  half  (//  p  2)) 
group) ) 

(defmethod  sort  distarray  0 

(clet  ((half  (ash  1  (-  ( i  nteger-length  (-  ( logical -1 imi t  self)  1))  1)))) 

(sort-p  group  half  half))) 


(defun  sort -di st array  (n  modulus) 

(sort  (make-di starray  n  modulus))) 

Figure  7-4.  Sort  Program 

Sort-dis (array,  given  the  values  o(  n  and  modulus,  sorts  an  array  of  n  pseudo-random  numbers.  The  Ah  pseudo¬ 
random  number  is  equal  to  Amod  modulus.  The  Batcher  sort  algorithm  is  used,  as  presented  on  page  1 12  of  [28]. 

Measurements  were  done  on  a  4-node  and  a  16-node  simulated  J-Machine.  The  results  of  the 
trials  are  summarized  in  Table  7-3. 

The  grain  size  is  the  third  number  in  the  working  instructions  executed  column.  The  time  to 
process  one  function  invocation  is  approximately  twice  the  grain  size  unless  tail-forwarding 
is  used  extensively.  Except  for  sorting  4  numbers  and  the  trivial  factorial  case,  the  results 
indicate  function  invocation  times  of  between  81  and  162  instructions,  which  means  that  the 
estimate  of  100  in  the  previous  section  was  about  right.  Many  of  the  functions  in  the  sort 
sample  program  are  tail-forwarded,  so  the  average  function  invocation  time  for  that  example 
is  less  than  twice  the  grain  size.  In  addition,  the  sort  program  has  a  grain  size  higher  than 
predicted  in  the  previous  section.  This  is  probably  due  to  frequent  calls  to  the  multiplication, 
division’,  and  co  primitives  as  well  as  to  distribution  of  large  code  objects;  the  grain  size  does 
decrease  for  larger  input  values. 


1 


A  division  by  2  is  just  a  single  ASH  instruction,  but  the  division  in  make-distarray  requires  a  complete  ?.ivide  call. 
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Table  7-3.  Performance  Measurements 


■1 

1 

i 

Input 

Invo 

catlo 

ns 

Start¬ 

up 

Total  Instructions 
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•The  average  message  length  includes  the  address  tvord  sent  at  the  beginning  of  each  message.  That  word  is 
kept  as  the  message  is  routed  through  the  network  but  removed  before  the  message  is  inserted  into  the  queue  on 
the  destination  node. 

The  working  instruction  counts  are  instruction  counts  with  all  stop  instructions  executed  in  background  loops  re¬ 
moved:  they  represent  the  useful  work  done  in  the  syste.m. 

The  three  numbers  in  the  total  instructions  executed  and  working  instructions  executed  columns  give  the  absolute 
numbers  of  instructions  executed,  the  numbers  of  instructions  per  word  of  network  traffic,  and  the  number  of  in¬ 
structions  per  network  message,  in  that  order. 

A  cold  startup  indicates  that  the  program  was  executed  just  after  it  was  loaded:  a  considerable  portion  of  the  run¬ 
ning  time  is  spent  on  distributing  the  functions  to  all  nodes  that  need  them. 

A  warm  startup  indicates  that  the  program  was  executed  after  the  functions  it  needed  were  already  installed  on 
every  node. 

A  hot  startup  indicates  the  third  trial  of  the  program  on  the  particular  input.  This  time  may  be  less  than  the  warm 
startup  time  because  the  previous  trials  have  preallocated  enough  standard  contexts  on  the  MDPs  to  let  the  pro¬ 
gram  run  without  the  need  to  allocate  any  more  contexts.  Warm  and  hot  startup  times  are  probably  the  most  rep¬ 
resentative  of  the  J-Machine's  performance  on  larger  problems. 

The  geometry  of  the  J-Machine  does  not  have  much  of  an  effect  on  a  program  simulated  under  MDPSim.  It  is 
unimportant  anyway  for  the  small  sizes  simulated  above. 

Using  inputs  much  larger  than  50  for  the  range  sums  or  100  tor  the  sort  generated  too  much  concurrency  and 
caused  the  message  queues  to  overflow  See  Chapter  8  for  a  possible  solution  to  this  problem. 

Another  pattern  in  Table  7-3  is  that  the  percentage  of  the  J-Machine  that  is  busy  is  higher 
for  the  larger  problems,  which  was  to  be  expected.  Also,  the  warm  and  hot  start  programs 
tended  to  exhibit  more  concurrency  than  the  cold  start  ones';  apparently  there  is  some 
wasted  time  during  the  initial  code  distribution  phase. 


'This  was  not  the  case  in  a  slightly  earlier  version  of  Cosmos,  possibly  because  it  was  less  cfRcicnt  and  therefore  had 
more  work  to  do. 
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Comparison  with  Dataflow 

Ellen  Spertus  made  a  few  performance  numbers  available  for  her  implementation  of  dataflow 
on  the  J-Machine  [34].  I  compared  her  timings  with  those  obtained  by  Optimist  Il/Cosmos  on 
the  same  examples.  The  program  used  was  the  factorial  function  listed  in  Figure  7-5. 

The  dataflow  interpreter  took  431  steps  to  compute  the  factorial  of  4.  The  Concurrent 
Smalltalk  version  of  the  factorial  program  took  725  steps  to  execute  from  a  cold  start  but  only 
265  steps  from  a  hot  start.  The  dataflow  interpreter  allocates  code  statically  and  references 
absolute  addresses,  so  every  timing  is  effectively  a  hot  start.  The  dataflow  interpreter  took 
628  steps  to  compute  three  factorials  of  4  in  parallel,  while  the  Concurrent  Smalltalk  code 
took  399  steps  to  complete  the  task.  Thus,  for  this  simple  example  the  Concurrent 
Smalltalk/Optimist  Il/Cosmos  combination  is  faster  than  dataflow,  but  not  by  much.  How¬ 
ever,  Concurrent  Smalltalk  is  more  dynamic  than  the  current  dataflow  system  in  [34]. 

(defun  fact  (n) 

(if  «-  n  1) 

1 

(•  n  (fact  (-  n  1)  >  > )  ) 

Figure  7-5.  Factorial  Program  used  in  Dataflow 

Network  Load 

As  seen  in  Table  7-3,  the  network  loading  is  usually  between  one  word  every  8  instructions 
and  one  word  every  20  instructions,  with  the  earlier  figure  dominating  as  the  J-Machine  uti¬ 
lization  approaches  100%.  If  an  average  MDP  instruction  length  is  taken  to  be  2.0  cycles, 
this  implies  that  a  program  could  inject  words  into  the  network  as  fast  as  one  word  every  16 
cycles  on  every  MDP. 

Suppose  that  we  run  one  of  the  above  programs  on  a  J-Machine  organized  as  a  kxkxk  mesh. 
Let  N=kykxk  be  the  number  of  nodes.  To  a  first-order  approximation,  the  capacity  of  the 
network  is  3A^  half-word-hops/cycle*,  or  1.5N  word-hops^cycle.  Assuming  random  sources  and 
destinations,  a  message  will  have  to  travel  an  average  of  k/3  nodes  on  each  of  the  three 
dimensions,  so  the  expected  distance  the  message  has  to  travel  is  3k/3  =  k  nodes.  Hence,  the 
network’s  theoretical  capacity  is  the  delivery  of  l.5N/k  =  l.5k^  words  per  cycle.  On  the  other 
hand,  the  program  offers  IV/16  words/cycle  to  the  network,  which  means  that  unless  locality 
is  exploited  or  the  program  slowed  down,  there  will  be  an  upper  bound  on  the  size  of  the  J- 
Machine  which  can  run  Cosmos. 

A  mesh  loaded  at  about  30%  of  its  theoretical  capacity  should  be  able  to  route  messages 
without  excessive  delays  [32].  To  calculate  the  maximum  k,  set 

0.3xl.5;fe2  ^  ^3/16 

k  =  7.2. 

Thus,  the  network  should  not  become  a  critical  resource  until  a  J-Machine  with  over  7^  =  343 
nodes  is  built.  If  the  network  routing  speed  is  doubled,  network  loading  should  not  be  prob¬ 
lematic  until  the  J-Machine  exceeds  14^  =  2744  nodes.  On  the  other  hand,  should  the  Cos¬ 
mos  operating  system  be  sped  up  somehow,  the  critical  size  might  fall  below  343  nodes.  Seri¬ 
ous  attention  to  locality  will  have  to  be  paid  if  a  J-Machine  larger  than  a  few  hundred  nodes 
is  built;  conversely,  if  only  a  small  J-Machine  is  built,  it  may  not  be  adequate  for  testing  al¬ 
gorithms  for  exploiting  locality  because  almost  any  algorithm  will  work. 


*The  J-Machine  network  can  transmit  half  a  word  between  every  pair  of  adjacent  MDPs  on  every  cycle. 
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Table  7-4.  Static  Instruction  Frequencies 


Instruction 

Count 

Freq. 

Instruction 

Count 

Freq. 

DC 

440  " 

19.33% 

XOR 

13 

READ 

324 

14.24% 

EQUAL 

11 

0.48% 

WRITE 

210 

9.23% 

ENTER 

10 

0.44% 

NOP 

173 

7.60% 

SENDE 

10 

0.44% 

WRITER 

104 

4.57% 

SEND2E 

10 

0.44% 

READR 

88 

3.87% 

NEC 

9 

0.40% 

BR 

86 

3.78% 

BZ 

9 

0.40% 

SEND 

80 

3.51% 

BNZ 

9 

0.40% 

ROT 

64 

2.81% 

PROBE 

7 

0.31% 

HALT 

64 

2.81% 

EQ 

7 

0.31% 

ADD 

59 

2.59% 

LT 

5 

0.22% 

AND 

50 

2.20% 

GT 

5 

0.22% 

BT 

46 

2.02% 

NOT 

4 

0.18% 

CALL 

46 

2.02% 

GE 

4 

0.18% 

BF 

42 

1.85% 

BNNIL 

4 

0.18% 

SUB 

39 

1.71% 

FFB 

4 

0.18% 

CHECK 

32 

1.41% 

LSH 

3 

0.13% 

OR 

29 

1.27% 

RTAG 

3 

0.13% 

XLATE 

29 

1.27% 

NEQUAL 

1 

0.04% 

BNIL 

25 

1.10% 

STOP 

1 

0.04% 

LDIPR 

23 

1.01% 

INVAL 

1 

0.04% 

SEND2 

22 

0.97% 

LE 

1 

0.04% 

LDIP 

21 

0.92% 

MUL 

0 

0.00% 

SUSPEND 

20 

0.88% 

MULH 

0 

0.00% 

ASH 

15 

0.66% 

CARRY 

0 

0.00% 

WTAG 

14 

0.62% 

NEQ 

0 

0.00% 

Instruction 

Count 

Freq. 

Instruction 

Count 

Freq. 

Move 

31 .90% 

Bit  Field 

116 

DC 

440 

19.33% 

Fault 

90 

3.95% 

ALU 

256 

1 1 .25% 

Other 

64 

2.81% 

Branch 

221 

9.71% 

Assoc.  Table 

47 

2.07% 

NOP 

173 

7.60% 

STOP 

1 

0.04% 

Network 

142 

6.24% 

Total  2276 


The  above  table  includes  the  static  instaiction  frequencies  in  the  Cosmos  kernel  and  the  MDP  runtime  system. 
The  second  table  categorizes  the  instructions  according  to  their  kinds.  Each  DC  is  counted  twice  because  it  oc¬ 
cupies  as  much  space  as  two  normal  instructions.  173  NOPs  had  to  be  inserted  to  align  instructions  to  word 
boundaries  around  DCs  and  at  branch  entry  points. 


Instruction  Frequencies 

I  collected  data  on  the  frequencies  of  various  MDP  instructions  to  provide  another  estimate  of 
what  the  MDP  is  doing  most  of  the  time.  Table  7-4  shows  a  histogram  of  the  static  instruc¬ 
tion  use  in  Cosmos  and  the  MDP  runtime  routines,  while  Table  7-5  shows  dynamic  instruc¬ 
tion  use  in  the  cold-start  sort  trial  running  on  16  MDPs  on  an  input  value  of  100.  Combined 
with  the  results  from  Table  7-6,  which  show  the  memory  reference  frequencies,  these  tables 
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contain  enough  information  to  deduce  the  approximate*  number  of  cycles  taken  per  MDP  in¬ 
struction. 

As  shown  in  Table  7-6,  a  16-MDP  J-Machine  will  achieve  somewhere  between  1.87  and  3.48 
cycles  per  working  instruction  when  running  the  sort  program  on  an  input  of  100.  The  inter- 
ral-memory-only  cycles-per-working-instruction  number  varied  between  1.8  and  2.0  for  other 
trials,  while  the  external-memory  cycles-per-working-instruction  number  varied  between  3.0 
and  3.9. 

Table  7-5.  Dynamic  Instruction  Frequencies 


Instruction 

Count 

Freq. 

Instruction 

Count 

Freq. 

STOP 

1143604 

43.77% 

CHECK 

14401 

0.55% 

READ 

309073 

1 1 .83% 

EQ 

14293 

0.55% 

WRITE 

169577 

6.49% 

LT 

11816 

0.45% 

ROT 

78272 

3.00% 

SEND2E 

11220 

0.43% 

READR 

76641 

2.93% 

NEG 

10886 

0.42% 

AND 

71150 

2,72% 

RTAG 

9859 

0.38% 

XLATE 

67230 

2.57% 

SENDE 

9594 

0.37% 

DC 

63981 

2.45% 

XOR 

5637 

0.22% 

BF 

53595 

2.05% 

FFB 

5624 

0.22% 

SEND 

47474 

1 .82% 

NOT 

5444 

0.21% 

ASH 

43161 

1 .65% 

EQUAL 

4725 

0.18% 

OR 

41948 

1.61% 

LE 

4296 

0.16% 

BR 

40800 

1.56% 

ENTER 

1117 

0.04% 

WRITER 

39085 

1.50% 

BNZ 

886 

0.03% 

SEND2 

30783 

1.18% 

LSH 

786 

0.03% 

ADD 

26468 

1.01% 

GE 

502 

0.02% 

NOP 

23633 

0.90% 

BZ 

475 

0.02% 

SUB 

21840 

0.84% 

BNNIL 

420 

0.02% 

BNIL 

20786 

0.80% 

MUL 

200 

0.01% 

SUSPEND 

20679 

0.79% 

PROBE 

74 

0.00% 

WTAG 

20406 

0.78% 

NEQUAL 

28 

0.00% 

BT 

19497 

0.75% 

NEQ 

0 

0.00% 

LDIP 

19032 

0.73% 

MULH 

0 

0.00% 

CALL 

18557 

0.71% 

CARRY 

0 

0.00% 

LDIPR 

17712 

0.68% 

HALT 

0 

0.00% 

GT 

15714 

0.60% 

INVAL 

0 

0.00% 

Instruction 

Count 

Freq. 

Instruction 

Count 

Freq. 

STOP 

1143604 

43.77% 

Assoc.  Table 

68421 

2.62% 

Move 

594376 

22.75% 

DC 

63981 

2.45% 

ALU 

283732 

10.86% 

Fault 

55301 

2.12% 

Branch 

136459 

5.22% 

NOP 

23633 

0.90% 

Bit  Field 

123724 

4.73% 

Other 

0 

0.00% 

Network 

119750 

4.58% 

Foreground 

1469377 

56.23% 

Total 

2612981 

This  particular  problem  (Sort  creating  and  sorting  an  array  of  100  numbers  on  16  MDPs)  only  kept  an  average  of 
56%  of  the  MDPs  busy  at  a  time — about  44%  of  the  instructions  executed  are  STOP  Although  the  frequency  of 
the  STOP  instruction  varies  widely,  the  relative  frequencies  of  the  other  instructions  are  typical  for  an  MDP  pro¬ 
gram. 


’Some  of  the  instruction  row  buffer  dynamics  were  simplified  and  all  branches  were  assumed  to  take  3  cycles,  even 
though  sometimes  they  may  take  fewer  cycles. 


112 


Chapter  7 


Performance  Measurements 


Table  7-6.  Memory  Access  Frequencies 

Operating  System  memory  usage: 

Reads:  394430  (0.15/instruction,  0.27/wort4ing  instruction) 

Writes:  1 52756  (0.06/instruction.  0.1 0/working  instruction) 

Fetches:  2295682  (0.88/instajction,  1.56/working  instruction) 

Heap  memory  usage: 

Reads:  1 52262  (0.06/instruction.  0.1 0/working  instruction) 

Writes:  138807  (0.05/instruction.  0.09/working  instruction) 

Fetches:  317299  (0.12/instruction,  0.22/wofking  instruction) 

Total  memory  usage: 

Reads:  546692  (0  21 /instruction,  0  37/working  instruction) 

Writes:  291563  (0.1 1 /instruction,  0.20/working  instruction) 

Fetches:  2612981  (1.00/instruction.  1.78/working  instruction) 

3.48  cycles/working  instruction 

1.87  cycles/working  instruction  without  external  RAM 

The  numbers  above  indicate  the  number  of  memory  references  (reads,  writes,  and  fetches)  done  to  the  operating 
system  (everything  except  the  heap)  and  heap  areas  of  memory  by  Sort  running  on  16  MDPs  with  an  input  of  100. 
the  numbers  for  the  other  sample  programs  are  similar.  The  cycles  per  instruction  figures  were  calculated  by 
adding  the  instruction  frequencies  from  Table  7-5  weighted  by  the  instruction  times  together  with  the  memory 
usage  frequencies  weighted  by  memory  access  times. 

The  4096-word  internal  memory  contains  all  of  the  operating  system  data  and  code  and  a  small  portion  of  the 
heap  (about  2100  words).  The  rest  of  the  heap  (65536  words)  lies  in  slow  external  memory.  When  running  on  a 
real  J-Machine,  the  sort  program  will  achieve  somewhere  between  1.87  and  3.48  cycles  per  working  instruction 
depending  on  how  much  of  the  program  and  data  resides  in  the  internal  memory  portion  of  the  heap. 

Considering  that  internal  memory  read,  write,  and  fetch  times  average  1,  0,  and  1/8  cycles*, 
respectively,  while  external  memory  read,  write,  and  fetch  times  are  6,  5,  and  3  cycles*,  re¬ 
spectively,  a  loss  of  only  a  factor  of  two  in  performance  by  placing  the  user  program  and  data 
in  external  memory  is  surprisingly  low.  The  reason  for  such  a  low  cycles-per-working-in- 
struction  figure  when  the  user  program  and  data  are  in  external  memory  is  the  high  Cosmos 
overhead.  The  MDP  spends  most  of  its  time  executing  Cosmos  code,  which  decreases  the  cy- 
cles-per-working-instruction  number  from  what  it  would  otherwise  have  been.  For  the  same 
reason,  changes  that  would  reduce  Cosmos  overhead  at  the  expense  of  user  program  size  are 
undesirable  in  most  cases. 


*The  write  time  is  0  because  it  is  absorbed  by  the  execution  of  the  write  instruction — WRI TE  does  not  require  any 
extra  cycles  when  writing  to  memory  as  opposed  to  a  register.  Eight  instructions  can  be  fetched  in  one  cycle  for  an 
effective  fetch  lime  of  1/8  cycle  per  instruction;  the  branch  instruction  cycle  counts  already  include  the  overhead  for 
fetching  the  next  set  of  instructions. 

*Two  instructions  arc  fetched  at  a  time  from  external  memory  in  6  cycles,  for  an  effective  fetch  time  of  3  cycles  per 
instruction. 
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Context  Switching  Performance 

A  large  component  of  the  current  operating  system  overhead  time  is  the  time  taken  to  save 
and  restore  contexts,  especially  in  the  CFUT  fault  handler.  One  possibility  to  increase  the 
speed  of  the  CFUT  fault  handler  is  to  not  save  data  registers  and  not  copy  the  message  upon 
a  CFUT  fault  [11].  Not  saving  data  registers  would  reduce  the  fault  handler’s  time  by  4  in¬ 
structions*,  while  not  copying  the  message  would  reduce  it  by  6  more  instructions.  However, 
these  gains  would  come  at  a  price — the  size  of  the  object  code  would  increase  because  the 
compiler  could  not  effectively  allocate  variables  to  registers;  it  is  not  clear  whether  the 
savings  in  the  operating  system  overhead  would  outweigh  the  increased  time  spent  executing 
user  code,  especially  if  the  user  code  lies  in  external  DRAM,  while  the  operating  system  lies 
in  fast  internal  SRAM. 

Summary 

Both  the  derived  and  measured  data  indicate  that  the  grain  size  for  running  Concurrent 
Smalltalk  on  the  J-Machine  is  50  to  70  instructions.  Since  most  functions  involve  two  mes¬ 
sages  (one  apply  message  and  one  return  message),  the  average  number  of  instructions  needed 
to  process  a  function  call  is  between  100  and  140;  actually,  it  is  probably  closer  to  100  be¬ 
cause  of  tail  forwarding. 

When  running  entirely  from  internal  memory,  the  MDP  executes  one  instruction  about  every 
two  cycles;  if  user  programs  and  data  have  to  be  accessed  from  external  memory,  that  count 
increases  to  about  four  cycles  per  instruction.  The  network  load  was  calculated  assuming  a 
fast  program  (two  cycles  per  instruction)  injecting  messages  into  the  network  at  the  fastest 
observed  rate  (one  word  every  eight  instructions)  and  utilizing  100%  of  the  J-Machine’s  pro¬ 
cessors.  If  the  messages  are  sent  randomly  under  the  above  conditions,  the  J-Machine  net¬ 
work  will  saturate  when  a  J-Machine  with  over  343  MDPs  is  built.  Of  course,  most  programs 
will  not  be  as  fast,  but  some  crafted  library  routines  could  impose  network  loads  as  high  as 
indicated  above  To  prevent  network  saturation,  either  the  network  will  have  to  be  made 
faster,  the  program  slower,  or  some  means  of  exploiting  locality  invented. 


*The  reduction  would  be  8  instructions  if  the  data  registers  did  not  have  to  be  restored  by  the  reply  handler;  how¬ 
ever,  it  is  diiTicult  for  the  reply  handler  to  distinguish  the  ca.scs  in  which  it  has  to  restore  registers  because  some 
unanticipated  fault  like  overflow  happened  from  the  cases  in  which  it  doesn’t;  the  extra  instructions  needed  to  make 
this  decision  would  make  this  optimization  not  worthwhile. 
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Although  working  Concurrent  Smalltalk  programs  have  been  demonstrated,  the  Concurrent 
Smalltalk  programming  system  is  by  no  means  complete.  Some  suggestions  for  improve¬ 
ments  were  discussed  throughout  the  previous  chapters — more  optimizations  could  be  added 
to  the  compiler,  distributed  objects  could  be  distributed  more  uniformly,  and  storage  used  by 
free  BRAT  entries  and  free  standard  contexts  could  be  placed  back  into  the  heap’s  free  stor¬ 
age  pool. 

Nevertheless,  the  possible  modifications  are  by  no  means  limited  to  the  minor  ones  listed 
there.  The  Concurrent  Smalltalk  programming  system  is  still  an  evolving  research  and 
demonstration  vehicle,  and  many  issues  still  have  to  be  addressed  before  it  ^comes  a  truly 
general-purpose  system.  This  chapter  lists  these  issues  together  with  potential  approaches 
for  addressing  them. 

The  first  section  lists  features  that  were  left  out  of  the  Concurrent  Smalltalk  implementation 
that  are  desirable  in  a  full  system.  These  features  are  useful  in  many  specialized  applica¬ 
tions,  but  the  system  can  work  without  them. 

The  second  section  lists  the  resource  management  concerns  raised  by  the  implementation  of 
Cosmos.  These  concerns  include  load  balancing,  garbage  collection,  name  space  reuse,  fanout 
bottlenecks,  and  parallelism  control.  A  few  ideas  are  suggested  about  handling  the  fanout 
bottleneck  and  parallelism  control  problems,  but  many  of  these  issues  are  still  in  the  re¬ 
search  stage. 

The  third  section  outlines  a  few  changes  that  could  be  made  to  the  MDP  architecture  that 
would  improve  the  performance  of  Cosmos  and  compiled  Concurrent  Smalltalk  programs. 
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8.1.  Features 

This  section  lists  additional  features  that  would  be  desirable  in  the  Concurrent  Smalltalk 
environment.  The  most  obvious  ones  are  the  current  omissions  from  Cosmos:  futures,  ar¬ 
rays,  floating  point  numbers,  and  overriding  primitive  methods.  In  addition,  the  perfor¬ 
mance  of  Concurrent  Smalltalk  loops  could  be  improved. 

Arrays 

Arrays  are  already  fully  implemented  in  Optimist  II — Optimist  II  can  interpret  and  compile 
code  containing  arrays.  Cosmos,  however,  does  not  currently  support  arrays.  When  imple¬ 
mented,  they  will  be  added  in  the  form  of  MDP  runtime  code  in  the  Runtime. m  Cosmos  file. 
Ideally  four  different  kinds  of  arrays  will  be  provided:  strings,  bit  arrays,  integer  arrays,  and 
general  object  arrays.  Strings  can  pack  four  characters  per  word,  bit  arrays  can  pack  thirty- 
two  booleans  per  word,  while  integer  arrays  can,  depending  on  the  range  of  integers  supplied, 
pack  1,  2,  4,  8,  16,  or  32  integers  per  word. 

I  expect  arrays  to  be  placed  in  self-contained  objects  fitting  on  single  nodes  rather  than  trees. 
This  will  limit  arrays  to  about  200  words  each  because  larger  objects  will  overflow  message 
queues  when  migrated.  If  large  arrays  are  desired,  distributed  array  classes  should  be  de¬ 
fined,  and  perhaps  new-simple-array,  new-integer-array,  new-string,  and  new- 
boolean-array  could  automatically  allocate  distributed  arrays  if  their  size  arguments  are 
large  enough. 

Enough  primitives  have  been  provided  in  Concurrent  Smalltalk  to  support  almost  all  com¬ 
mon  array  operations  efficiently.  The  map  and  init  methods  treat  arrays  dataflow-style,  al¬ 
lowing  elements  of  arrays  to  be  defined  in  terms  of  other  elements  of  the  arrays.  Cfutures 
could  be  used  in  unpacked  arrays  to  prevent  elements  of  arrays  from  being  read  before  writ¬ 
ten;  if  an  array  is  packed,  a  bitmap  of  valid  elements,  perhaps  stored  in  a  context,  could  be 
attached  to  it. 

Although  implementing  arrays  well  on  the  J-Machine  is  not  particularly  difficult,  it  is  quite 
time-consuming  and  was  omitted  from  this  thesis  for  this  reason. 

Overriding  Primitive  Selectors 

Concurrent  Smalltalk  allows  user  programs  to  override  primitive  selectors  such  as  +  and  <, 
thereby  allowing  the  implementation  of  additional  number  types  such  as  complex  numbers 
and  matrices  which  respond  to  the  traditional  numeric  operations.  While  Optimist  II  permits 
selectors  to  be  overridden  in  its  interpreter.  Cosmos  does  not  support  this  facility,  again  be¬ 
cause  this  feature  would  be  too  time-consuming  to  implement. 

Adding  the  ability  to  override  primitive  selectors  will  not  be  as  easy  as  adding  arrays,  but, 
fortunately,  all  the  hardware  building  blocks  needed  are  present  in  Architecture  llB.  When 
an  instruction  is  executed  on  a  word  with  a  type  not  supported  by  the  hardware  for  that  in¬ 
struction,  the  MDP  faults.  When  a  system  call  such  as  Divide  is  done  on  words  with  unsup¬ 
ported  types,  the  operating  system  halts.  All  of  the  type-related  fault  handlers  and  halts  will 
have  to  be  implemented;  they  will  have  to  decode  the  operation  which  caused  the  fault  and 
emulate  it  by  performing  a  standard  message  send.  This  emulation  will  require  a  lot  of  at¬ 
tention  to  little  details  and  will  be  error-prone.  Also,  the  context  will  have  to  be  enlarged. 

For  an  example  of  the  complexity  involved,  suppose  the  user  overrides  the  =  method  to  sup¬ 
port  complex  numbers.  One  of  the  consequences  might  be  that  a  BNZ  instruction  somewhere 
in  the  program  faults  id  because  it  was  called  on  a  complex  number  instead  of  an  integer. 
The  fault  handler  will  have  to  call  the  =  method  to  compare  the  complex  number  against 
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zero.  In  order  to  make  this  call,  it  has  to  save  the  entire  state  of  computation  in  the  context 
plus  two  more  words:  a  return  IP  back  to  the  fault  handler  and  a  slot  into  which  the  result 
(true  or  false)  should  be  written.  When  the  fault  handler  regains  control,  it  will  examine 
the  slot  and  either  take  the  branch  in  the  user  program  or  let  execution  continue  with  the 
next  instruction.  Due  to  CFUT-handling  in  the  MDP’s  architecture  (specifically,  because 
WRITE  does  not  fault  on  cfutures),  primitive  selectors  can  never  return  cfutures. 

Another  issue  is  what  to  do  with  commutative  operations  such  as  +.  One  might  add  an  inte¬ 
ger  to  a  complex  number  or  a  complex  number  to  an  integer,  and  it  would  be  nice  not  to  have 
to  override  the  integer  method  for  +  to  implement  complex  numbers.  To  implement  this 
cleanly,  the  fault  handler  for  add  would  have  to  try  adding  the  arguments  in  one  order,  and, 
if  no  method  matched,  reverse  the  arguments  and  try  again.  If  no  method  matched  a  second 
time,  it  would  halt. 

Finally,  a  few  minor  modifications  may  have  to  be  done  to  Optimist  IPs  back  end  to  support 
overriding  primitive  selectors. 

Long  Integers 

Once  overriding  primitive  selectors  is  supported,  it  will  not  be  particularly  difficult  to  imple¬ 
ment  a  bignum  package  for  the  MDPs  and  watch  how  many  microseconds  it  takes  a  J-Ma- 
chine  to  compute  the  factorial  of  1000. 

Futures 

Optimist  II  currently  provides  most  of  the  support  needed  for  full  futures,  although  some 
modifications  would  still  be  necessary.  The  major  changes  would  be  to  the  operating  system. 
The  changes  would  be  similar  to  those  needed  to  implement  primitive  selector  overriding — 
FUT  fault  vectors  would  have  to  be  defined  and  emulate  all  possible  cases. 

Floating  Point  Numbers 

There  are  four  different  ways  to  implement  floating  point  facilities  on  the  J-Machine.  Rang¬ 
ing  from  the  easiest  to  the  hardest  and  most  exotic,  they  are; 

1.  Emulate  operations  on  the  FLOAT  data  type  through  software  fault  handlers.  This  ap¬ 
proach  would  provide  IEEE-compatible*,  single-precision  floating  point  number  capability. 
Unfortunately,  this  approach  would  be  very  slow  because  of  the  large  instruction  decoding 
and  floating  point  packing  and  unpacking  overheads.  The  advantages  of  this  approach  are 
simplicity,  transparency,  and  IEEE  compatibility,  if  desired. 

2.  Store  floating  point  numbers  as  two  words  each.  One  word  would  be  the  exponent  and 
the  other  the  mantissa.  The  precision  would  be  intermediate  between  single  precision  and 
double  precision.  Floating  point  operations  could  be  inlined,  and  micro-optimization  tech¬ 
niques  [12]  could  be  applied.  The  advantage  of  this  approach  is  speed  without  the  need  for 
extra  hardware.  The  disadvantages  are  that  this  approach  would  need  object  inlining  to  be 
implemented  by  Concurrent  Smalltalk  (otherwise  this  technique  would  be  even  slower  than 
technique  1),  the  floating  point  number  format  is  nonstandard,  and  the  use  of  floating  point 
numbers  would  be  cumbersome.  Since  floating  point  variables  would  take  two  words  instead 
of  one,  they  would  have  to  be  declared  as  such  to  avoid  losing  efficiency,  and  a  variable  could 
not  efficiently  support  both  floating-point  and  non-floating-point  values.  The  last  restriction 
is  a  major  problem  because  floating-point  versions  of  all  of  the  methods  operating  on  general 
objects  might  have  to  be  written  and  used  to  achieve  good  performance. 

3.  The  third  possibility  would  be  the  inclusion  of  a  floating-point  unit  on  the  MDP.  The  unit 
would  require  no  significant  software-visible  architectural  changes;  the  arithmetic  instruc- 


’ANSI-IEEE  standard  754. 
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tions  would  simply  start  working  on  words  tagged  FLOAT,  and  maybe  a  Div  instruction  and  a 
few  control  registers  would  appear.  The  disadvantage  of  this  approach  would  be  the  inclu¬ 
sion  of  a  hardware  floating  point  unit  on  the  MDP,  which  would  increase  the  hardware’s 
complexity.  The  advantages  would  be  speed,  simplicity,  transparency,  and  IEEE  compatibil¬ 
ity,  if  desired. 

4.  The  last  possibility  would  be  addition  of  RAP  [19]  chips  to  the  J-Machine  network.  RAP 
chips  are  custom  chips  that  contain  a  large  number  of  serial  floating  point  units,  achieving 
estimated  peak  performance  of  300  MFLOPS  per  chip.  Under  this  approach,  an  MDP  would 
send  floating  point  calculations  to  a  friendly  neighborhood  RAP,  which  would  do  the  calcula¬ 
tions  and  respond  to  the  continuation  it  was  given.  This  approach  would  work  well  if  the 
floating  point  calculations  were  grouped  and  did  not  have  to  be  mixed  with  symbolic  process¬ 
ing.  If  the  MDPs  were  to  perform  mainly  symbolic  processing  with  occasional  floating  point 
instructions,  the  message  overheads  would  make  this  approach  inefficient.  This  approach 
would  require  a  large  investment  in  operating  system  and  runtime  software,  and  it  is  not 
immediately  clear  that  it  would  be  faster  than  approach  3,  although  the  potential  payoff  is 
large. 

True  Loops 

Loops  are  currently  not  implemented  particularly  efficiently  in  Concurrent  Smalltalk.  It  is 
not  clear  whether  this  inefficient  implementation  will  hurt  program  performance;  it  does,  of 
course,  depend  on  how  often  loops  are  used.  Using  iterators  and  similar  abstractions  to  step 
through  arrays  and  other  data  structure  is  usually  preferred  to  using  loops  because  iterators 
might  execute  in  parallel,  while  loops  are  inherently  sequential.  Nevertheless,  there  might 
be  some  situations  where  sequential  loops  are  needed. 

The  primary  reason  for  the  current,  inefficient  implementation  of  loops  is  the  need  to  ensure 
that  a  loop  does  not  execute  for  a  long  time  uninterrupted,  preventing  other  messages  from 
being  executed  at  the  node  and  maybe  even  causing  a  message  queue  overflow.  Currently 
Optimist  II  compiles  a  loop  into  a  function  which  calls  itself  tail-recursively,  which  is  a  fairly 
large  penalty  to  pay  for  tight  loops.  The  implementation  could  be  improved  to  a  true  loop  in¬ 
side  a  lambda  if  the  code  inside  the  loop  either  made  at  least  one  full-fledged  function  call  per 
iteration  or  tail-recursed  every  few  iterations;  either  case  takes  care  of  the  message  problem. 
Some  experimentation  is  needed  in  this  area  to  determine  the  best  course  of  action. 

Inline  Objects 

The  largest  feature  change  to  the  Optimist  II  compiler  would  be  the  addition  of  inline  objects. 
This  would  be  a  difficult  and  error-prone  process  because  all  cases  have  to  be  handled  well; 
these  cases  include  passing  an  inline  object  to  a  function  that  does  not  expect  one,  storing  in¬ 
line  objects  in  contexts,  creating  pointers  to  inline  objects,  and  altering  inline  objects.  It  is 
likely  that  if  inline  objects  were  implemented,  several  versions  of  each  function  would  be 
compiled.  One  version  would  be  unoptimized,  while  the  others  would  support  inline  objects 
as  arguments  and  results.  The  constant  folder  would  then  try  to  convert  unoptimized  func¬ 
tion  calls  to  optimized  function  calls  in  the  same  way  it  currently  converts  method  calls  to 
function  calls. 
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8.2.  Resource  Management 

Concurrent  Smalltalk  presents  the  programmer  with  an  ideal  model  of  a  machine  with  an 
unlimited  number  of  processors  and  an  unlimited  amount  of  memory;  unfortunately,  real 
computers  are  limited  in  both  the  number  of  processors  and  the  size  of  memory.  Several  re¬ 
source  management  problems  result  from  the  discrepancy  between  the  Concurrent  Smalltalk 
ideal  and  the  hardware  reality.  These  problems  include  reusing  memory  that  can  no  longer 
be  accessed  and  simulating  an  unlimited  number  of  processors  with  a  Hxed,  finite  number. 
Additionally,  there  are  a  few  bottlenecks  in  the  current  system  that  can  be  ignored  in  small 
implementations  but  will  become  important  in  large-scale  systems. 

Heap  Compaction 

The  current  design  of  the  Cosmos  heap  compactor  compacts  the  entire  MDP  heap  when  a 
storage  allocation  request  exceeds  available  free  memory.  This  approach  works,  but  it  has 
two  significant  disadvantages,  both  related  to  the  long  time  it  takes  to  compact  the  memory: 

1.  On  a  small  J-Machine,  the  MDP  will  effectively  stop  responding  until  the  heap  com¬ 
paction  is  done.  In  the  few  tens  of  thousands  of  instruction  it  takes  the  MDP  to  do  the  heap 
compaction,  the  other  MDPs  may  run  out  of  things  to  do  and  all  wait  for  the  stopped  MDP. 
The  heap  compaction  will  effectively  stop  the  entire  computer.  Soon  after  the  first  MDP  fin¬ 
ishes  its  heap  compaction,  another  MDP  may  starts  its  own  compaction,  and  the  process  will 
repeat. 

2.  On  a  large  J-Machine,  a  heap  compaction  on  one  MDP  will  not  be  enough  to  stop  the 
other  MDPs  from  running;  instead,  they  will  continue  to  run  longer  and  are  likely  to  send 
enough  messages  to  the  compacting  MDP  that  its  incoming  message  queue  overflows.  The 
poor  MDP  now  does  not  know  what  to  do  because  it  has  no  free  memory  into  which  to  put  the 
extra  messages. 

Finally,  the  current  heap  compactor  does  not  compact  BRAT  entries  or  standard  contexts, 
but  it  could  compact  them  with  a  little  additional  effort. 

An  incremental  heap  compactor  would  address  both  of  the  serious  disadvantages  of  the  cur¬ 
rent  heap  compactor.  It  might  even  be  possible  to  run  the  incremental  heap  compactor  in  the 
MDPs  background  mode,  although  the  lack  of  a  separate  set  of  fault  vectors  and  a  full  set  of 
registers  would  pose  serious  detriments. 

Fanout  Bottlenecks 

Cosmos  currently  assigns  one  node  as  a  “home”  of  an  object;  with  few  exceptions,  if  a  differ¬ 
ent  node  needs  a  copy  of  that  object,  it  turns  to  the  home  node  to  get  it,  and  the  home  node 
takes  care  of  supplying  the  object.  Unfortunately,  sometimes  many  nodes  want  to  use  the 
same  object  simultaneously.  Accesses  to  mutable  objects  are  serialized  anyway,  so  having  a 
home  node  for  a  mutable  object  is  not  such  a  bad  idea;  however,  there  is  no  reason  why  ac¬ 
cesses  to  immutable  objects  should  be  unnecessarily  serialized.  On  the  contrary,  functions 
are  immutable  objects,  and  it  would  be  nice  if  a  function’s  home  node  did  not  have  to  send  a 
copy  of  the  code  to  every  other  node  on  a  65536-MDP  computer. 

One  solution  to  this  bottleneck  would  be  to  assign  several  home  nodes  to  each  immutable  ob¬ 
ject;  perhaps  the  more  popular  the  object  would  be,  the  more  home  nodes  it  would  have. 
When  another  node  needed  a  copy  of  that  object,  it  could  ask  the  closest  home  node.  If  one 
home  node  were  made  special  and  all  the  others  allowed  to  purge  their  copies  of  the  object 
because  they  could  get  it  from  the  special  home  node,  this  scheme  would  become  a  distribu¬ 
tion  tree.  Brian  Totty  presents  an  analysis  of  distribution  trees  in  [381. 
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Cosmos  also  serializes  the  allocation  of  distributed  objects  at  one  node  because  of  the  need  to 
give  each  distributed  object  a  unique  ID.  The  allocation  process  could  be  parallelized  by 
splitting  the  ID  space  and  making  several  nodes  responsible  for  allocating  distributed  ob¬ 
jects,  one  for  each  chunk  of  addressing  space. 

Garbage  Collection 

Garbage  collection  on  the  J-Machine  is  currently  an  open  research  problem.  Parallel  garbage 
collection  algorithms  exist,  but  they  may  not  work  well  on  the  J-Machine.  For  example,  the 
parallel  garbage  collection  algorithm  in  [29]  requires  a  node  to  keep  track  of  all  of  the  local 
IDs  it  sends  to  other  nodes,  which  would  be  unfeasible  for  two  reasons.  First,  each  MDP 
spends  a  considerable  amount  of  its  time  sending  data  onto  the  network,  and  its  performance 
would  suffer  if  it  had  to  record  every  ID  sent.  Second,  most  local  IDs  become  known  to  other 
nodes  in  the  J-Machine,  degenerating  the  algorithm’s  performance. 

Perhaps  the  best  solution  is  a  simple  mark-and-sweep  algorithm  run  on  all  MDPs  in  parallel; 
after  all,  the  combined  MDPs  have  a  considerable  amount  of  processing  power.  Unfortu¬ 
nately,  this  approach  has  three  potential  problems: 

1.  The  mark-and-sweep  garbage  collector  has  to  stop  the  J-Machine,  and  it  might  be  diffi¬ 
cult  to  stop  all  processors  and  allow  the  messages  in  the  network  to  land  somewhere,  espe¬ 
cially  if  the  messages  in  the  network  are  blocked  because  some  node  is  out  of  memory  and 
queue  space. 

2.  The  J-Machine  network  bandwidth  may  be  insufficient  for  a  mark-and-sweep  garbage 
collection. 

3.  There  may  not  be  enough  room  on  the  MDPs  for  the  intermediate  storage  needed  by  the 
algorithm.  In  particular,  if  all  the  MDPs  immediately  start  marking  their  root  sets,  all  mes¬ 
sage  queues  will  quickly  overflow  with  mark  messages.  This  is  a  parallelism  control  prob¬ 
lem. 

Load  Management 

The  purpose  of  load  management  on  the  J-Machine  is  to  distribute  a  parallel  computation 
evenly  throughout  the  processors  while  keeping  network  congestion  low.  Load  management 
is  a  very  broad  current  research  area.  Cosmos  and  Optimist  II  include  limited  attempts  to 
balance  the  load — Optimist  II  distributes  the  objects  it  compiles  evenly  among  the  nodes  of 
the  J-Machine,  and  Cosmos  allocates  new  objects  on  random  nodes  and  evaluates  applica¬ 
tions  on  primitive  objects  on  random  nodes  to  prevent  the  entire  computation  from  taking 
place  on  one  node.  Nevertheless,  these  are  only  initial  steps  to  addressing  the  load  manage¬ 
ment  issues.  The  following  are  at  least  some  of  the  load  management  concerns  that  should 
be  addressed  on  a  large  J-Machine: 

•  The  current  system  for  allocating  objects  may  have  to  be  reevaluated.  At  least  theoreti¬ 
cally,  the  current  system  should  perform  quite  well  if  all  objects  are  about  the  same  size.  If 
the  nodes  on  which  objects  are  allocated  are  always  picked  randomly,  memory  usage  on  all 
nodes  will  remain  within  a  few  standard  deviations  of  the  average  memory  usage,  so  even  on 
a  large  J-Machme  the  probability  that  a  single  node’s  memory  overflows  can  be  made  expo¬ 
nentially  small.  On  the  other  hand,  real  programs  may  allocate  objects  with  a  large  variation 
of  sizes,  and  they  may  wish  to  allocate  objects  on  specific  nodes  to  take  advantage  of  locality. 
Both  of  these  conditions  may  overflow  memory  on  some  nodes  while  other  nodes  still  have  a 
considerable  amount  of  free  memoi^. 

•  An  analogous  issue  to  the  one  above  is  handling  message  queue  overflows.  Due  to  the 
queues’  small  sizes  and  the  large  variance  in  the  sizes  of  messages,  it  is  difficult  to  make 
queue  overflows  statistically  unlikely.  Instead,  mechanisms  have  to  be  introduced  to  handle 
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them.  These  mechanisms  should  not  allocate  extra  local  memory  because  a  queue  overflow  is 
most  likely  to  happen  when  little  or  no  memory  is  available  because  it  is  being  compacted. 

•  MDPSim  assumes  that  the  MDPs  are  connected  by  a  crossbar  network,  so  all  MDPs  are 
equally  far  apart  from  each  other.  This  is  a  good  approximation  on  a  small  J-Machine — on  a 
64'node  J-Machine  organized  as  4x4x4  MDPs,  no  two  processors  are  more  than  9  links  apart, 
while  the  expected  distance  between  two  random  nodes  is  only  3  links.  On  the  other  hand, 
on  a  65536-node  J-Machine  organized  as  64x32x32,  locality  becomes  an  important  issue;  if 
objects  continue  to  be  allocated  randomly,  the  network  will  become  hopelessly  congested. 

There  are  two  general  approaches  to  distributing  the  load  evenly.  One  approach  is  to  make 
objects  very  mobile  and  hope  that  they  will  redistribute  themselves  to  exploit  locality.  When 
a  portion  of  the  J-Machine  becomes  congested,  it  could  simply  throw  objects  at  the  rest  of  the 
J-Machine.  JOSS  hints  [38]  were  an  example  of  a  technique  that  could  be  used  by  this  ap¬ 
proach.  While  this  approach  is  simple,  it  does  suffer  from  some  disadvantages.  In  particular, 
if  load  management  decisions  are  made  often,  they  cannot  be  too  time-consuming  to  prevent 
excessive  overhead.  Also,  when  an  object  migrates  often,  it  is  difficult  for  a  node  to  send  a 
message  to  it.  In  JOSS,  if  a  node  does  not  know  where  an  object  is,  it  sends  the  message  to 
the  object’s  home  node  instead,  which  forwards  it  to  the  object.  If  an  object  is  not  at  the  home 
node,  then  both  the  home  node  and  the  object’s  current  node  are  congested  with  messages 
addressed  to  the  object.  JOSS  attempted  to  correct  this  problem  through  the  use  of  hints,  but 
JOSS-style  hints  may  be  ineffective  because  all  first-time  users  of  an  object  must  still  first 
reference  the  home  node  to  get  to  the  object. 

The  other  approach  is  making  objects  on  the  J-Machine  relatively  static  and  redistributing 
them  to  balance  the  load  only  occasionally.  This  is  the  approach  taken  in  Cosmos.  Objects 
are  free  to  move  around  the  J-Machine  for  short  periods  of  time,  but  an  object’s  home  node 
asks  the  object  to  return  to  it  when  it  another  node  sends  a  message  to  the  object  via  the 
home  node.  Hence,  objects  tend  to  remain  where  they  were  first  created.  As  long  as  the  ob¬ 
ject  allocator  allocates  objects  well,  the  load  will  remain  roughly  balanced.  Any  small  dy¬ 
namic  imbalances  that  arise  can  be  handled  by  the  garbage  collector,  which  could  have  the 
power  to  truly  change  an  object’s  home  node  by  renaming  all  of  the  IDs  in  the  entire  J-Ma¬ 
chine  pointing  to  the  object. 

Controlling  Parallelism 

In  addition  to  load  balancing,  which  distributes  a  fixed  amount  of  work  among  the  MDPs,  it 
will  also  be  necessary  to  throttle  the  amount  of  work  being  done  by  the  J-Machine  as  a  whole. 
A  simple  example  illustrates  this  point. 

(Defmethod  fib  Integer  () 

(if  (<=  self  2) 

1 

(+  (fib  (-  self  D)  (fib  (-  self  2))))) 

Figure  8-1.  A  Doubly-Recursive  Fibonacci  Program 

Consider  the  doubly-recursive  Fibonacci  program  in  Figure  8-1.  When  run  on  a  sequential 
computer,  the  program  traverses  the  computation  tree  of  the  Fibonacci  function  in  a  depth- 
first  order  (Figure  8-2),  taking  only  0(n)  space  but  exponential  time  to  compute  Fib(n).  On 
the  other  hand,  when  run  on  the  J-Machine,  each  invocation  of  fib  except  the  tail  ones  at¬ 
tempts  to  evaluate  the  two  recursive  calls  in  parallel.  In  effect,  the  computer  traverses  the 
computation  tree  in  breadth-first  order  (Figure  8-3).  This  is  good  if  there  are  many  proces¬ 
sors,  because  then  the  function  is  computed  in  only  0(n)  time.  Unfortunately,  this  manner  of 
computation  requires  an  exponential  amount  of  both  main  memory  and  message  queue 
space.  Thus,  a  parallel  computer  can  fail  if  a  program  exhibiting  too  much  paralleli.sm  is  run 
on  it. 
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Figure  8-2.  Progress  of  a  Sequential  Computation 

Although  the  computation  consists  of  a  large  number  N  of  function  invocations,  a  sequential  computer  traverses 
the  computation  tree  in  depth-first  order,  so  only  0(log  N)  functions  are  active  at  any  particular  time  (bold  gray), 
and  the  “wavefront*  of  computation  consists  of  only  a  single  invocation  (bold  black)  O(log  A/)space  is  required  to 
run  the  program  and  constant-size  message  queues  suffice  because  the  wavefront  is  at  most  one  invocation. 


Figure  8-3.  Progress  of  a  Parallel  Computation 

A  parallel  computation  tends  to  evaluate  the  computation  tree  in  breadth-first  order,  which  requires  the  storage  of 
most  of  the  function  invocations  in  the  computation  tree  at  about  the  halt-way  point.  Thus,  the  computation  re¬ 
quires  space,  and,  moreover,  the  "wavefront"  can  also  become  as  large  as  0(N)  Hence,  the  computation 
also  requires  0(N)  message  queue  space  The  computation  will  exceed  the  parallel  computer's  memory  if  N  is 
large  compared  with  the  number  of  processors 
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When  the  compiled  code  for  Fibonacci  is  run  on  a  simulated  4-node  J-Machine,  Fib(ll)  is  the 
largest  value  that  can  be  computed.  An  attempt  to  compute  Fib(12)  results  in  queue  over¬ 
flows;  enlarging  the  message  queues  or  spilling  them  into  main  memory  would  not  help  much 
because  the  storage  needs  grow  exponentially. 

Fortunately,  it  appears  that  a  solution  to  this  problem  does  exist.  Why  not  change  from 
evaluating  the  computation  tree  in  a  breadth-first  fashion  to  a  depth-first  fashion  when  all  of 
the  processors  on  the  J-Machine  are  busy?  A  seven-instruction  change  to  the  compiled  code 
for  Fib  (Figure  8-4)  accomplishes  just  what  is  needed.  The  change  forces  sequential  evalua¬ 
tion  of  Fib’s  two  recursive  calls  if  the  local  message  queue  is  more  than  a  quarter  full.  Thus, 
the  computation  grows  exponentially  until  all  MDPs  are  saturated.  From  then  on  until  the 
answer  is  ready,  all  MDPs  are  busy  computing  the  problem  without  increasing  the  space  re¬ 
quirements.  .Ailer  the  change  was  made,  the  Fib  program  could  calculate  answers  for  much 
larger  inputs. 

The  simple  change  in  Figure  8-4  is  not  a  panacea,  though.  The  change  allows  enough  paral¬ 
lelism  for  the  message  queues  to  be  a  quarter  full  on  the  average  throughout  the  J-Machine. 
Unfortunately,  in  practical  simulations  the  sizes  of  the  queues  vary  widely — the  queues  on 
some  processors  might  be  empty,  while  other  MDPs  may  have  queues  that  are  more  than 
half  full.  It  is  easy  to  see  why  this  might  happen — an  MDP  with  a  nearly  empty  queue  is  not 
throttled  down  and  will  happily  send  messages  to  an  MDP  with  a  nearly  full  queue.  Due  to 
this  variance,  the  queues  overflowed  anyway  if  the  threshold  for  inhibiting  parallelism  was 
set  to  half  of  the  queue  size.  To  summarize,  it  seems  that  this  approach  for  controlling  paral¬ 
lelism  will  work,  but  it  may  have  to  be  combined  with  load  balancing  to  keep  the  variance  in 
queue  sizes  low. 

Name  Spaces 

The  scarcity  of  IDs  in  the  32-bit  name  space  is  also  an  important  consideration  on  the  J-Ma- 
chine.  After  allowing  for  flags  and  nonuniform  usage  of  the  name  space,  32  bits  allow  only 
about  a  billion  objects  to  be  named  on  the  J-Machine.  Furthermore,  if  the  name  space  is  not 
reused,  a  J-Machine  could  run  out  of  names  in  less  than  a  second — each  node  is  limited  to 
creating  only  about  32000  objects  before  exhausting  its  name  space. 

To  solve  this  problem,  object  IDs  could  be  collected  and  reused  by  the  garbage  collector*.  The 
garbage  collector  could  compact  the  ID  space,  which  would  also  permit  an  ID-renaming  load 
balancer  almost  for  free.  However,  even  this  approach  might  not  be  enough.  If  the  J- 
Machine  is  implemented  using  technology  of  the  1990’s,  it  may  well  have  enough  physical 
memory  to  overflow  the  32-bit  name  space  even  with  garbage  collection.  At  that  point  the 
only  reasonable  solution  will  be  to  increase  the  word  size,  perhaps  to  64  bits. 


*An  approach  that  almost  works  and  docs  not  require  a  garbage  collector  is  to  test  each  candidate  ID  in  the  ID- 
generation  routine.  If  the  ID  names  an  existing  object,  the  ID-gcnerator  simply  chooses  another  ID.  Unfortunately, 
this  approach  docs  not  work  for  immutable  objects  because  some  copies  of  such  objects  could  exist  with  the  home 
node  not  knowing  about  them.  Keeping  the  home  node  informed  about  copies  of  its  immutable  objects  would  cause 
bottlenecks  of  its  own,  not  the  least  of  which  are  the  space  needed  to  store  such  information  and  the  network  band¬ 
width  used  to  maintain  it. 
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MODULE  fFib 

DC  MSGrhdrCopyable  |  cFunction«of  f  setNI  32 

DC  {fFib} 

DC  5 

MOVE  [2,A3),R3 
LT  R3,2,R1 

BF  R1,^L001 

MOVE  1 , R1 
BR  ^L002 

LOOl:  SUB  R3,2,R2 

MOVE  k2,R0 
CALL  objectNode 
DC  MSG:msgApplyFunction| 5 

SEND20  R1,R0 
DC  {fFib} 

SEND20  R0,R2 
MOVE  5 , RO 
SEND2E0  (1,A1],R0 
WTAG  R0,6,R0 
MOVE  RO, [5,A1) 

MOVE  QBL.Rl 
WTAG  R1,INT,R1 
DC  63 

AMD  R1,$3FF,R1 
LE  IU,R0,R1 

BT  R1 ,  '^En^ty 

MOVE  [5,A11,R0 
Eapty:  SUB  R3,1,R2 

MOVE  R2 , RO 
CALL  objectNode 
DC  MSG:msgApplyFunction I  5 

SEND20  R1,R0 
DC  {fFib} 

SEND20  R0,R2 
MOVE  6,R0 
SEND2E0  [1,A11,R0 
WTAG  R0,6,R0 
MOVE  RO,  C6.A1] 

MOVE  [6,A1],R2 
ADD  R2,[5,A1],R1 
L002:  MOVE  [3,A31,R2 

BNIL  R2,^L003 
DC  MSG  :insgReply  M 

SEND20  R2,R0 
SENDO  R2 
SEND2E0  C4,A31,R1 
L003:  SUSPEND 

END 

Figure  8-4,  Modified  Fib  Assembly  Language  Function 

When  the  incoming  message  queue  is  at  least  a  quarter  full,  the  modilled  Fib  function  throttles  down  the  paral¬ 
lelism  by  waiting  until  the  result  of  the  first  recursive  call  has  been  received  before  starting  the  second  one.  The 
modification  is  shown  in  bold.  No  parallelism  penalty  other  than  the  execution  of  six  extra  instructions  is  paid  when 
the  J-Machine  is  not  saturated 
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8.3.  Architectural  Considerations 

Some  architectural  modifications  could  be  made  that  would  streamline  execution  of  MDP 
code  in  critical  sections  in  the  operating  system. 

Minor  Instruction  Set  Changes 

One  set  of  optimizations  with  a  relatively  large  payoff  would  be  allowing  MOVE  instructions 
from  the  ID  registers  directly  to  memory  and  XLATE  instructions  directly  from  memory  into 
address  and  ID  register  pairs.  Introducing  these  instructions  would  cut  the  number  of  in¬ 
structions  needed  to  save  and  restore  ID  registers  on  context  switches  by  half,  and  it  would 
accelerate  allocation  and  deallocation  of  fast  contexts. 

A  large  part  of  the  operating  system  is  still  spent  saving  and  restoring  state  in  fault  han¬ 
dlers.  AJso,  most  faults  point  the  FIP  register  to  the  instruction  after  the  one  that  faulted, 
while  most  fault  handlers  (with  the  notable  exception  of  CALL)  would  rather  resume  the  in¬ 
struction  that  faulted,  requiring  the  FIP  to  point  to  the  instruction  that  faulted.  Backing  up 
the  FIP  by  one  instruction  takes  five  or  seven  instructions  depending  on  whether  a  free  regis¬ 
ter  is  available.  Pointing  the  FIP  to  the  instruction  that  faulted  (except  for  CALL  faults)  or 
making  an  extra  shadow  FIP  register  that  points  to  the  instruction  that  faulted  would  reduce 
the  number  of  instructions  needed  in  important  fault  handlers  such  as  CFUT,  EARLY,  and 
SEND — the  EARLY  and  SEND  fault  handlers  would  be  reduced  from  eight  instructions  to 
one! 

Other  critical  resources  which  are  near  the  limits  of  their  capacities  are  the  message  queues 
and  the  XLATE  table.  The  message  queues  can  only  be  made  to  hold  1024  words,  and  the 
XLATE  table  cannot  hold  more  than  512  bindings.  If  a  MDP  has  65536  words  of  memory,  it 
might  be  beneficial  to  have  a  2048-binding  XLATE  table  or  a  message  queue  that  could  hold 
4096  words,  especially  if  large  objects  are  frequently  transmitted  over  the  network. 

Another  critical  resource  in  the  XLATE  table  is  the  key  space.  The  XLATE  table  is  a  popular 
associative  cache  in  the  operating  system,  and  it  is  used  for  a  variety  of  purposes.  Unfortu¬ 
nately,  there  are  only  16  tags  on  the  MDP,  and  tag  conflicts  exist  among  the  keys  XLATEd  by 
the  users  of  the  XLATE  table.  For  example,  class/selector  pairs  had  to  be  tagged  INSTI  be¬ 
cause  all  of  the  “normal”  tags  were  already  taken.  A  future  version  of  the  operating  system 
might  run  out  of  key  tags  for  the  XLATE  table.  Possible  solutions  to  this  problem  include, 
but  are  not  limited  to,  providing  several  XLATE  tables  or  using  more  than  one  word  as  a  key. 

Finally,  one  instruction  is  seldom  used  and  could  be  removed.  The  INVAL  instruction  is  used 
only  once  in  Cosmos  in  the  heap  compactor,  and  since  a  heap  compaction  takes  a  long  time 
anyway,  emulating  INVAL  in  software  would  neither  be  difficult  nor  harm  performance. 

Fast  Context  Saves  and  Restores 

Perhaps  a  more  ambitious  project  would  be  to  attempt  to  improve  the  MDPs  context-switch¬ 
ing  time  by  supporting  in  hardware  a  shadow  image  of  the  registers  in  memory.  In  other 
words,  the  registers  would  act  as  a  cache  for  a  context  in  memory.  When  a  context  switch  oc¬ 
curred,  the  modified  registers  would  be  written  back  into  memory  and  a  new  register  set 
loaded  from  the  new  context.  Quick  register  saving  and  restoring  for  fault  handling  is  even 
more  important  than  fast  context  switching,  and  this  approach  might  be  generalized  to  sup¬ 
port  fault  handling  as  well  by  allocating  a  context  to  each  fault  handler  that  wanted  one. 
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8.4.  Conclusion 

A  number  of  desirable  features  for  future  inclusion  in  Cosmos  or  Optimist  II  were  described, 
including  arrays,  full  futures,  overriding  primitive  selectors,  floating  point  numbers,  and 
large  integers.  Implementing  arrays  and  primitive  selector  override  facilities  should  not  pre¬ 
sent  migor  difficulties,  although  it  will  be  time-consuming.  Several  approaches  for  imple¬ 
menting  floating  point  numbers  were  discussed,  including  two  software  approaches — a  fast 
and  dirty  one  and  a  clean  but  slow  one — as  well  as  two  hardware  approaches — including  a 
floating  point  unit  on  every  MDP  and  including  RAP  chips  in  the  J-Machine  network. 

In  addition,  a  number  of  resource  management  issues  were  discussed,  ranging  from  heap 
compaction,  garbage  collection,  load  management,  and  ID  reuse  to  fanout  bottlenecks  and 
parallelism  control.  New  methods  may  have  to  be  developed  to  support  efficient  garbage  col¬ 
lection  on  the  J-Machine,  but  once  garbage  collection  is  done,  ID  reuse  and  load  management 
may  be  obtained  for  free.  Parallelism  control  is  a  serious  issue  in  many  applications.  An  ap¬ 
plication  that  tries  to  operate  on  a  large  data  set  in  parallel  or  explore  a  large  search  tree  will 
quickly  overflow  the  entire  J-Machine’s  queue  capacity.  One  approach  to  solving  this  prob¬ 
lem  was  explored — if  Concurrent  Smalltalk  code  switches  to  evaluating  itself  sequentially 
when  the  local  queue  size  exceeds  a  threshold,  the  total  queue  size  on  the  J-Machine  appears 
to  remain  bounded,  although  individual  queues  may  still  overflow.  This  approach  shows 
some  promise  for  solving  the  parallelism  control  problem. 

Finally,  a  few  changes  to  the  MDP  architecture  were  proposed.  Allowing  direct  moves  to  and 
from  ID  registers  and  providing  the  right  FIP  value  after  a  fault  would  save  instructions  in 
many  critical  Cosmos  code  sections. 

While  Cosmos  and  Optimist  form  a  workable  system  as  they  are  now,  much  fine-tuning  re¬ 
mains  to  be  done.  Due  to  a  lack  of  time,  a  few  features  of  Concurrent  Smalltalk  have  not 
been  fully  implemented.  The  door  is  now  open  for  experimenting  with  the  difficult  problems 
of  load  management,  concurrency  control,  and  garbage  collection.  These  areas  have  not  been 
studied  very  much  in  the  context  of  fine-grain  parallel  computers,  and  there  is  room  for  both 
practical  and  theoretical  results. 
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Optimist  11 

Optimist  II  is  a  second-generation  optimizing  compiler  for  Concurrent  Smalltalk,  and  the 
first  to  implement  nearly  the  entire  revised  Concurrent  Smalltalk  language.  Optimist  II 
builds  upon  Optimist  by  adding  an  interactive  prototyping  and  debugging  environment  and  a 
few  new  classes  of  optimizations.  The  introduction  of  global  optimizations  was  especially 
valuable  in  making  Concurrent  Smalltalk  easy  to  use  efficiently  and  the  runtime  system  easy 
to  write.  The  greatest  advantage  of  global  optimizations  is  that  they  permit  the  programmer 
to  divide  a  system  into  self-contained  abstractions  without  suffering  a  performance  penalty 
for  doing  so.  There  is  a  trend  in  modern  programming  languages  towards  global  optimiza¬ 
tions^,  and  Optimist  II  shows  that  they  are  both  feasible  and  desirable  for  a  language  like 
Concurrent  Smalltalk. 

Cosmos 

Cosmos  is  an  optimized  operating  system  for  the  J-Machine.  In  addition  to  performing  the 
necessary  services  to  keep  the  J-Machine  running,  it  includes  facilities  for  function  and 
method  calls;  local  and  global  object  allocation,  disposal,  and  migration;  method  lookup  ta¬ 
bles;  distributed  object  creation  and  addressing;  and  various  utilities.  A  few  interesting  pro¬ 
gramming  techniques  were  used:  an  infinite  loop  broken  by  a  fault  is  used  for  block  moves 
and  sends,  and  an  addressing  scheme  was  developed  for  distributed  objects  that  allows  easy 
addressing  of  constituents  while  at  the  same  time  distributing  them  throughout  the  J-Ma¬ 
chine  and  allowing  efficient  implementation  of  an  operation  that  returns  a  nearby  con¬ 
stituent. 

Cosmos  was  fairly  difficult  to  write  due  to  the  constant  specter  of  re-entrancy  problems  and 
double  faults.  These  errors  were  the  most  common  problems  in  JOSS  (38].  Nevertheless, 
with  the  aid  of  the  criticality  system  those  difficulties  were  overcome.  Unfortunately,  the  ca¬ 
sualty  of  this  battle  with  re-entrancy  is  ease  of  modification  of  the  Cosmos  kernel — the  kernel 
is  now  one  compact  piece  of  code.  Nevertheless,  it  should  not  be  necessary  to  make  extensive 
modifications  to  that  kernel  in  order  to  add  the  features  mentioned  in  Chapter  8. 

Debugging 

An  important  consideration  when  designing  a  complicated  computer  system  today  is  ensuring 
that  it  is  debuggable.  The  hardware  world  has  been  buzzing  with  ideas  such  as  design-for- 
test  for  a  few  years  now;  yet,  these  ideas  are  just  as  applicable  to  software.  Thus,  Cosmos  in¬ 
cludes  consistency  checks  in  strategic  locations  which  detect  common  errors  that  may  be  com¬ 
mitted  by  Concurrent  Smalltalk  programs.  However,  even  with  those  checks  debugging  a 
Concurrent  Smalltalk  in  assembly  language  is  unpleasant  and  not  as  interactive  as  it  could 
be;  for  this  reason,  Optimist  II  includes  an  interpreter  which  can  be  used  to  get  a  Concurrent 
Smalltalk  program  working  before  it  is  run  on  a  J-Machine. 

Performance  Measurements 

Performance  measurements  on  a  simulated  J-Machine  indicate  that  the  grain  size  (the  num¬ 
ber  of  instructions  executed  in  response  to  a  message)  averages  about  60  instructions.  Since 
most  functions  invocations  involve  two  messages  (tail-forwarded  invocations  being  an  impor- 


*For  example,  C++  (37)  allows  functions  to  be  declared  inline,  recommending  that  the  compiler  inline  them  in  other 
functions. 
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tant  exception),  the  average  number  of  instructions  needed  to  process  a  function  call  is  about 
100  to  120;  the  number  is  lower  if  many  tail-forwarded  invocations  are  made. 

The  MDP  executes  one  instruction  about  every  two  cycles  when  running  entirely  from  inter¬ 
nal  memory;  when  the  user  program  and  data  are  located  in  external  memory,  that  count 
only  doubles  to  about  four  cycles  per  instruction  even  though  the  external  memory  is  5  times 
slower  for  writing,  3  times  slower  for  reading,  and  about  24  times  slower  for  fetching  instruc¬ 
tions.  The  reason  for  the  unusually  low  cycles-per-instruction  number  when  the  user  pro¬ 
gram  and  data  are  located  in  external  memory  is  the  high  operating  system  overhead;  since 
the  operating  system  is  always  in  internal  memory,  running  operating  system  code  out  of  in¬ 
ternal  memory  tends  to  pull  the  cycles-per-instruction  number  down. 

Under  good  conditions  the  MDPs  can  saturate  the  network  on  J-Machines  larger  than  343 
nodes,  although  most  programs  will  not  execute  fast  enough  for  the  network  to  saturate  until 
significantly  larger  J-Machines  are  used.  To  prevent  network  saturation,  either  the  network 
will  have  to  be  made  faster,  the  program  slower,  or  some  means  of  exploiting  locality  in¬ 
vented. 

Future  Work 

Many  ideas  for  future  work  and  research  were  outlined  in  Chapter  8.  The  short-term  goals 
are  twofold:  first,  to  fill  the  remaining  holes  in  the  implementation  of  Concurrent  Smalltalk; 
in  particular,  arrays  will  be  useful  for  running  real  Concurrent  Smalltalk  applications;  sec¬ 
ond,  to  write  some  nontrivial  Concurrent  Smalltalk  programs  and  see  how  well  they  can  uti¬ 
lize  the  J-Machine’s  power.  In  addition,  the  load  management  and  parallelism  control  issues 
in  Chapter  8  should  be  explored.  So  far  development  of  the  Concurrent  Smalltalk  environ¬ 
ment  has  I  een  done  without  much  feedback  from  applications  because  until  very  recently  it 
was  not  possible  to  run  any  applications  on  even  a  simulated  J-Machine.  Now  that  the  com¬ 
piler  and  the  operating  system  are  operative,  it  will  be  possible  to  close  the  loop  and  provide 
concrete  measures  of  the  J-Machine’s  performance  on  real  problems. 

Hopes 

Optimist  II  and  Cosmos  are  but  an  early  step  in  an  evolving  base  of  software  for  the  J-Ma- 
chine.  My  hopes  are  that  the  J-Machine  will  evolve  into  a  computer  competitive  with  today’s 
fastest  computers  on  numerical  codes  and  surpassing  them  on  less-structured  but  nonethe¬ 
less  computation-intensive  Artificial  Intelligence  applications. 

When  I  originally  wrote  this  thesis  in  early  1989, 1  wrote  that  I  was  hoping  to  be  able  to  run 
a  Concurrent  Smalltalk  program  on  a  set  of  real  MDPs.  Two  years  later,  during  the  summer 
of  1991,  this  wish  came  true. 
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A.l.  Introduction 

Concurrent  Smalltalk  (CST)  is  a  concurrent  descendant  of  Smalltalk.  It  is  an  object-oriented 
programming  language  developed  for  multiple  instruction/multiple  data  concurrent  comput¬ 
ers  such  as  the  J-Machine.  It  is  an  interesting  language  for  a  message-passing  concurrent 
computer  because  it  encourages  locality  and  disciplines  the  use  of  message-passing. 

Goals 

Concurrent  Smalltalk  is  a  high-level  language  intended  for  general-purpose  programming  of 
the  J-Machine.  It  was  created  and  revised  with  the  following  goals  in  mind; 

•  Expressiveness.  Concurrent  Smalltalk  must  be  expressive  enough  to  support  the  paral¬ 
lel  programming  paradigms  we  desire  to  research  on  the  J-Machine.  In  particular,  it  must 
support  object-oriented  programming  and  fine-grained  parallelism.  Also,  since  a  large  part  of 
the  Concurrent  Smalltalk  runtime  system  is  written  in  itself.  Concurrent  Smalltalk  must 
support  higher-order  features  such  as  reasoning  about  classes  of  objects. 

•  Consistency.  Features  which  would  interact  destructively  with  other  features  were  left 
out.  For  example,  become,  although  a  useful  Smalltalk-80  construct,  would  confuse  the  type 
semantics  so  it  was  left  out. 

•  Simplicity.  Concurrent  Smalltalk  should  be  as  simple  as  possible.  In  order  to  reach  the 
goal  of  simplicity,  Concurrent  Smalltalk  should  consist  of  a  few  orthogonal  concepts.  It  is 
very  important  that  Concurrent  Smalltalk  contain  no  surprises — one  should  be  able  to  tell 
what  a  program  should  do  by  reading  it.  Features  involving  action  at  a  distance  (i.e.  having 
a  statement  invisibly  affect  another  statement  far  away)  were  intentionally  excluded. 

•  Familiarity.  Programmers  familiar  with  existing  languages  should  be  able  to  carry  over 
their  experience  to  Concurrent  Smalltalk.  Also,  corresponding  features  should  act  in  the 
same  ways,  which  reinforces  the  “no  surprises”  philosophy.  On  the  other  hand,  Concurrent 
Smalltalk  is  most  similar  to  Smalltalk-80,  Common  Lisp,  and  Scheme  in  this  respect.  Hence, 
static  scoping  is  used  for  variables. 

•  Efficiency.  It  is  important  to  be  able  to  compile  Concurrent  Smalltalk  programs  into  ef¬ 
ficient  machine  code.  An  efficient  implementation  allows  a  programmer  to  concern  himself 
primarily  with  algorithms  and  implementation  rather  than  performance  tuning.  Concurrent 
Smalltalk  is  not  a  tightly  bound  low-level  language  in  order  to  give  the  compiler  latitude  in 
optimizing  code. 

•  Commonality.  The  sets  of  built-in  classes  and  methods  presented  in  this  language  spec¬ 
ification  are  by  no  means  minimal.  However,  the  built-in  classes  are  frequently  used  and 
were  included  in  order  to  provide  a  common  base  for  Concurrent  Smalltalk  programs.  The 
inclusion  of  frequently  used  classes  has  three  advantages: 

•  The  built-ins  are  implemented  only  once,  saving  time  and  effort. 

•  The  built-ins  provide  a  consistent  functional  and  naming  specification. 

•  The  built-ins  can  be  optimized  for  efficiency. 
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Format 

BNF 

The  syntax  of  commands  is  presented  in  BNF.  Literals  are  presented  in  bold,  while  non-ter¬ 
minals  and  metasymbols  are  plain.  There  are  two  enhancements  to  the  BNF  syntax: 

The  {expr1  I  expr2  j  . . .  |  exprn}  form  specifies  that  each  expr  can  appear  at  most  once,  but  they 
can  appear  in  any  order.  The  symbol  =  expr  form  is  a  macro  used  for  readability.  It  specifles 
that  whenever  symbol  appears,  it  should  be  replaced  by  expr  before  any  productions  are  done. 

Methods  and  Functions 

The  declarations  of  methods  and  functions  are  presented  in  a  syntax  similar  to  that  used  by 
defmethod.  To  give  an  example, 

(move  whatirobot  x,  y,  z:integer  thetarfloat)  :result  Method 

declares  a  method  called  move  of  class  robot  that  takes  a  receiver  argument  what  of  class 
robot,  three  integer  arguments,  x,  y,  and  z,  and  a  float  argument,  theta.  That  method 
returns  an  object  of  class  result. 

Sometimes  an  abstract  class  like  number  is  declared  that  has  no  direct  instance  objects;  in¬ 
stead,  every  object  of  class  number  is  also  an  object  of  one  of  number’s  subclasses.  Methods 
of  an  abstract  class  may  or  may  not  have  definitions  for  that  class.  A  method  that  does  not 
have  a  definition  for  the  abstract  class  is  called  an  abstract  method.  For  example,  +  is  an  ab¬ 
stract  method  of  class  number;  there  exists  no  generic  method  to  add  two  arbitrary  numbers. 
Instead,  when  +  is  called  on  two  numbers,  the  definition  of  +  for  either  the  class  integer  or 
the  class  float  is  used.  Were  a  third  number  subclass,  complex-number,  defined,  it  would 
have  to  define  its  own  +  method.  On  the  other  hand,  the  zero?  method  of  class  number  is 
not  abstract  because  it  uses  the  =  method  (a  method  defined  on  all  numbers).  Thus,  com¬ 
plex-number  does  not  have  to  define  its  own  zero?  method. 

Abstract  methods  are  indicated  by  the  words  Abstract  Method  on  the  right  side  of  the  dec¬ 
laration  line. 

Optional  statements  are  extensions  to  the  basic  Concurrent  Smalltalk  language.  They  are 
not  guaranteed  to  be  present  in  all  implementations  of  Concurrent  Smalltalk,  but  if  an  im¬ 
plementation  supports  the  capabilities  described  by  optional  statements,  it  should  use  the  de¬ 
scribed  syntax. 
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A.2.  Syntax 

Tokens 

A  Concurrent  Smalltalk  token  is  an  arbitrarily  long  string  composed  of  the  characters  A-z,  a- 
2,  0-9,  .  <,  =,  >,  i,  and  The  characters  !,?,&,  and  8  may  not  be 

used  at  the  beginning  of  a  token,  and  a  token  may  not  be  composed  entirely  of  periods  ( . )  or 
underscores  (_).  Also,  tokens  beginning  with  an  underscore  (_)  or  a  percent  sign  (%)  are  re¬ 
served  for  system  purposes  and  macros  and  should  not  be  used  by  user  programs.  Case  is 
not  significant. 

A  token  is  considered  to  be  a  number  if  it  consists  entirely  of  the  characters  0-9,  _,  +,  -,  /,  . , 
E,  or  I;  it  contains  at  least  one  digit;  it  begins  with  +,  -,  or  a  digit;  and  it  does  not  end  with  a 
digit.  These  rules  are  borrowed  from  Common  Lisp.  E  introduces  an  exponent,  while  I  can 
be  used  for  complex  numbers  if  they  are  implemented.  Any  token  that  is  not  a  number  is  an 
identifier. 

Identifiers 

Concurrent  Smalltalk  uses  static  scoping  of  identifiers.  Local  identifiers  shadow  identical 
global  identifiers,  and  the  meaning  of  an  identifier  can  be  determined  by  its  location  in  the 
text  of  the  program.  Global  identifiers  are  introduced  by  the  following  top-level  statements 
and  their  derivatives; 

•  Defconstant,  to  define  a  constant; 

•  Defglobal,  to  define  a  global  variable; 

•  Def  class,  to  define  a  class; 

•  Def  selector,  to  define  a  method  selector; 

•  Define,  to  define  one  global  identifier  in  terms  of  another. 

The  syntactic  sugar  defmethod  expands  into,  among  other  statements,  a  def  selector,  de¬ 
fining  a  global  identifier.  Similarly,  defun  expands  into  a  defconstant  statement  that  de¬ 
fines  the  function. 

Except  for  classes,  the  above  categories  share  a  single  name  space.  Redefining  a  global  iden¬ 
tifier  causes  an  error  or  a  warning  unless  the  new  definition  is  identical  to  the  old  one.  Class 
names  have  lower  precedence  than  other  global  identifiers,  so  a  global  constant  can  shadow  a 
class  name. 

All  macros  are  global;  however,  macros  are  also  in  a  name  space  separate  from  the  one 
shared  by  the  above  categories.  Since  macros  match  patterns  instead  of  just  names,  two 
macros  may  share  the  same  name.  If  more  than  one  pattern  is  applicable,  one  is  chosen  at 
the  implementation’s  discretion.  Whenever  a  macro  is  applicable,  it  is  expanded,  unless  one 
of  the  literals  specified  in  the  macro  pattern  is  shadowed  by  a  local  identifier. 

Local  identifiers  are  introduced  by  the  following  statements  and  their  derivatives: 

•  Lambda  and  defun  introduce  the  names  of  formal  parameters. 

•  Method-Lambda  and  defmethod  introduce  self,  group,  names  of  the  instance  vari¬ 
ables,  and  names  of  the  formals.  If  a  name  conflict  occurs  between  the  formals,  instance 
variables,  self,  and  group,  the  results  are  unspecified. 

•  Let,  clet,  mv-clet,  and  mv-let  introduce  names  of  the  locals. 

•  Lambda,  method-lambda,  defun,  defmethod,  block,  and  loop  introduce  the  names  of 
continuations. 

All  of  the  shadowing  rules  are  summarized  in  Figure  A-1. 
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Language  Global  Local  ...  Local 


Figure  A-1.  Scopes  of  Identifiers 

The  scopes  of  various  kinds  of  identifiers  are  shown  above.  Except  tor  macros,  sets  of  identifiers  connected  by 
thick  lines  are  mutually  exclusive  and  may  not  contain  duplicate  names  To  find  the  meaning  attributed  to  an  iden¬ 
tifier,  follow  the  anows  from  the  bold  pattern  indicating  the  identifier's  usage  to  the  first  box  that  contains  the  identi¬ 
fier.  For  example,  if  i  is  encountered  in  a  program,  it  is  first  checked  to  be  a  local  in  the  innermost  scope,  then  a 
local  in  the  next  innermost  scope,  and  so  on  until  the  global  scope  is  reached.  If  i  is  not  a  valid  macro  pattern,  it  is 
checked  against  the  globals,  parameters,  and  constants,  and  finally  classes  On  the  other  hand,  if  » :  i  is  encoun¬ 
tered,  i  is  checked  against  the  names  of  classes  only.  « '  i  searches  only  globals,  parameters,  and  constants, 
both  user  and  predefined. 


132 


Appendix  A 


Concurrent  Smalltalk  Reference 


Identifiers  that  are  not  defined  globally*  or  in  any  enclosing  scope  are  defined  as  globals. 
They  must  be  defined  before  they  are  used.  The  exceptions  to  this  rule  are  identifiers  en¬ 
closed  in  quote  or  class  statements  listed  below. 

(global  identifier)  Primitive 

# '  identifier 

Global  returns  the  global  identifier  identifier,  which,  if  already  defined,  must  be  a  global  (not 
a  class).  If  identifier  is  not  already  defined  globally,  it  is  defined  as  a  global. 

(class  identifier)  Primitive 

# :  identifier 

Class  returns  the  global  class  class.  Since  classes  are  in  a  separate  name  space  from  other 
globals,  no  error  occurs  if  there  is  already  a  global  identifier  defined  with  the  same  name  as 
identifier. 

Symbols 

(quote  (nil  |  true  |  false  |  identifier  |  number  |  character  |  string))  Primitive 
'(nil  I  true  |  false  |  identifier  |  number  |  character  |  string) 

Symbols  can  be  specified  by  preceding  including  them  in  a  quote  form  as  above,  which  can 
be  abbreviated  by  a  quote  mark  (')•  When  presented  with  an  identifier,  the  quote  expres¬ 
sion  evaluates  to  a  symbol.  Any  valid  identifier  except  nil,  true,  and  false  can  be  used — 
symbols  cannot  be  captured  by  any  scope,  nor  can  they  be  globally  redefined.  Nil,  true,  and 
false  are  treated  specially — (quote  nil)  returns  the  null  object  nil,  while 
(quote  true)  returns  the  boolean  true  and  (quote  false)  returns  the  boolean  false, 
(quote  number),  (quote  character) ,  and  (quote  string)  just  returns  the  number  number, 
character,  or  string. 

Constants 


A  few  constants  are  predefined.  These  are  listed  in  Table  A-1  below.  In  addition,  any  num¬ 
ber  can  be  specified  by  just  including  the  number.  Characters  can  be  specified  by  preceding 
them  with  #\.  Strings  can  be  specified  by  enclosing  them  in  double  quotes  (")•  Double 
quotes  can  be  included  inside  strings  by  preceding  them  with  \. 

Table  A-1.  Predefined  Constants 


Constant 

TRUE 

FALSE 

NIL 

End-of-f lie 


Value 

True 

False 

Nil 

An  end-of  file  object 


Class 

True^ 

False^ 

Null 

Object 


*It  is  up  to  iho  implementation  to  derinc  the  meaning  of  a  global  dePinition  here.  When  a  file  is  compiled,  an  im¬ 
plementation  might  choose  to  read  all  of  the  definitions  in  the  file  and  then  compile  the  code,  or  it  could  compile  the 
file  incrementally.  In  the  latter  case  forward-referenced  identifiers  will  be  considered  undefined. 

^Sinoe  t  rue  is  a  global  conslant,  *  :  t.  rLie  has  to  be  used  to  refer  to  class  true .  Also,  class  t:  rue  is  a  subclass  of  class 
Doolean. 

^Since  false  is  a  global  constant,  N  ;  1  a .  se  hos  to  be  used  to  refer  to  class  false.  Also,  class  fa  1  se  is  a  subclass  of 
class  boolean . 
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Comments 

Comments  may  be  placed  anywhere  in  source  files.  A  comment  starts  with  a  semicolon  (;) 
and  is  terminated  by  the  end  of  the  line.  Comments  are  treated  as  if  they  were  line  breaks 
by  the  reader. 
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A.3.  Programs 

program  ::=  (top-level  1  statement)* 

A  Concurrent  Smalltalk  program  is  a  sequence  of  top-level  forms.  Additionally,  an  imple¬ 
mentation  may  allow  begin  and  if  as  top-level  forms  if  test  is  a  constant  expression  and 
statements  in  body,  consequent,  and  alternative  are  top-level  forms.  Other  statements  may 
also  be  allowed  at  the  top  level  by  extended  implementations.  Statements  at  the  top  level  are 
executed  sequentially  as  if  they  were  enclosed  in  a  begin . 

Constant  Expressions 

Constant  expressions  are  expressions  that  have  to  be  evaluated  at  compile  time.  A  constant 
expression  can  include  any  expression  or  function  call,  except  that  constant  expressions  may 
not  produce  distributed  objects  as  values  and  may  not  call  functions  that  use  futures. 

Global  Definitions 

All  constants,  parameters,  and  globals  reside  in  a  single  name  space;  in  general,  redefining 
an  identifier  with  a  different  meaning  causes  an  error.  Macros  reside  in  a  separate  name 
space  and  do  not  conflict  with  each  other  or  any  other  global  objects  (although  they  may  be 
shadowed  by  local  static  scoping). 

(def constant  name(;type]  value)  Top-level  Primitive 

Defconstant  defines  a  constan*^  named  name.  The  constant  can  be  any  valid  Concurrent 
Smalltalk  type.  If  type  is  specified,  the  value  must  have  that  type.  Once  a  constant  is  de¬ 
fined,  it  may  not  be  changed  (another  constant  is  accepted,  though,  if  it  has  the  same  value). 
Constants  encountered  in  methods  are  replaced  by  their  values  at  compile  time.  Value  must 
be  a  constant  expression.  Predefined  constants  are  listed  in  Table  A-1. 

Language  primitives  and  built-in  functions  and  selectors  are  defined  as  global  constants. 

(def parameter  name[:type]  value)  Top-level  Primitive 

Defparameter  defines  a  parameter  named  name.  The  parameter  can  have  any  valid  Con¬ 
current  Smalltalk  type.  If  type  is  specified,  the  value,  if  present,  must  have  that  type.  If  no 
type  is  specified,  type  is  assumed  to  be  ob  ject,  the  most  general  type.  Parameters  encoun¬ 
tered  in  methods  are  replaced  by  their  values  at  compile  time.  Value  must  be  a  constant  ex¬ 
pression.  Unlike  constants,  parameters  may  be  redefined  at  the  top  level,  but  their  types 
may  not  be  changed.  The  value  of  a  parameter  may  not  be  changed  by  a  running  program. 

User  functions  and  selectors  defined  using  defun,  defmethod,  and  def  selector  are  de¬ 
fined  as  parameters.  Hence,  they  may  be  redefined. 

(defglobal  name[:type]  [value])  Top-level  Primitive 

Defglobal  defines  a  global  named  name.  The  global  can  have  any  valid  Concurrent 
Smalltalk  type.  If  type  is  specified,  the  value,  if  present,  must  have  that  type.  If  no  type  is 
specified,  type  is  assumed  to  be  object,  the  most  general  type.  Value  must  be  a  constant  ex¬ 
pression. 

A  global  may  be  defined  several  times,  but  only  the  value  from  the  first  definition  is  used. 
Nevertheless,  all  definitions  of  a  global  must  have  the  same  type. 


Concurrent  Smalltalk  on  the  Message-Driven  Processor 


(define  name  name)  Top-level  Primitive 

This  primitive  defines  the  first  name  as  an  alias  for  the  object  specified  by  the  second  name. 
For  example,  if  the  second  name  refers  to  a  global,  after  this  primitive  is  executed,  both 
names  will  refer  to  the  same  global. 

(undef  name)  Top-level  Primitive 

This  primitive  removes  the  top-level  definition  of  name,  if  any.  It  should  be  used  with  cau¬ 
tion,  as  it  is  possible  to  bring  the  system  into  an  inconsistent  state  using  undef. 
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A.4.  Classes 


Built-in  Classes 

A  few  classes  are  predefined.  These  are  listed  in  Table  A-2,  and  their  hierarchy  is  shown  in 
Figure  A-2.  The  defclass  primitive  can  be  used  to  define  other  classes,  which  may  be  based 
on  the  built-in  ones  shown  in  bold  in  Figure  A-2. 

Defining  New  Classes 

(defclass  Class  {class-declaration)  superclasses  Top-ievei  Primitive 

instance-var-spec*) 
class  ::=  name 
superclasses  :;=  (class-i-) 

instance-var-spec  ;;=  typed-names  |  (typed-names  {instance-var-declaration}) 
typed-names  ::=  name  (,  name)*  [:  type] 
instance-var-declaration  =  siniine  |  £not-iniine  | 

&  reader  names  | 

fiwriter  names  I  icwriter  names  1 
&cas-er  names 
names  :;=  name  |  (name*) 

Defclass  defines  a  new  Concurrent  Smalltalk  class.  A  class  is  a  template  for  specifying  ob¬ 
jects  and  methods.  Each  object  belonging  to  the  class  contains  the  instance  variables  defined 
in  the  class  definition  as  well  as  the  instance  variables  inherited  from  its  superclasses,  if  any. 

In  the  class  definition.  Class  is  the  new  class  name.  It  is  followed  by  an  optional  declaration, 
described  later,  the  class’s  superclass  list,  and  finally  the  additional  instance  variables  de¬ 
clared  by  the  class. 

Class  Inheritance 

Each  user-defined  class  must  have  at  least  one  superclass,  but  it  may  have  more  than  one.  A 
class  inherits  the  instance  variable  and  method  definitions  from  its  superclasses.  It  may  add 
its  own  instance  variables  and  methods,  and  it  may  attempt  to  override  existing  methods.  If 
a  class  is  overriding  a  method,  the  new  method  must  be  a  subtype  of  the  existing  one. 

A  simple  form  of  multiple  inheritance  is  allowed.  Two  or  more  superclasses  may  be  specified 
for  a  class  under  the  following  conditions; 

•  There  must  be  no  instance  variable  conflicts  among  the  superclasses.  Formally,  this  re¬ 
quirement  is  satisfied  if  and  only  if  out  of  the  superclasses  Si,  83,  ...  Sn  provided  there  is  one  S, 
such  that  if  v  is  an  instance  variable  of  S,,  ISjSn,  then  v  is  an  instance  variable  of  S|  or  one  of 
its  superclasses. 

•  There  must  be  no  inherited  method  conflicts  among  the  superclasses.  Formally,  this 
means  that  if  selector  s  is  associated  with  method  m,  for  superclass  s,  and  method  mj  for  Sj, 
then  mj  and  mj  are  the  same  method  (Textual  equivalence  of  the  method  code  is  not  enough; 
mj  and  mj  must  “point”  to  the  same  method). 

The  class  then  inherits  all  of  the  instance  variables  and  all  methods  from  all  of  its  super¬ 
classes. 
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Instance  Variables 

After  the  superclasses  in  the  class  declaration  is  a  list  of  new  or  redefined  instance  variables. 
Instance  variables  without  any  type  are  given  the  type  object.  An  instance  variable  may  be 
specified  to  have  the  same  name  as  an  instance  variable  of  one  of  the  superclasses.  If  so,  the 
specified  type  must  be  a  subtype  of  the  original  instance  variable’s  type,  and  either  both  or 
neither  must  be  inline. 

An  instance  variable  may  be  declared  &  inline  or  &not -inline.  These  are  hints  to  the 
compiler  that  the  variable’s  object  should  be  placed  inline  or  on  the  heap  (not  inline).  These 
hints  only  apply  if  the  variable’s  type  is  an  inline  class.  The  compiler  is  free  to  ignore  these 
declarations. 

Reader  and  Writer  Methods 

A  few  methods  are  automatically  defined  when  a  class  C  is  defined.  For  each  instance  vari¬ 
able  X  of  c,  two  functionally  identical  methods  are  defined,  named  x  and  get-x,  that,  when 
called  on  an  object  o  of  class  c,  return  the  value  of  x  in  o.  These  methods  are  called  reader 
methods;  two  are  defined  in  order  to  avoid  name  conflicts  with  instance  variables.  Similarly, 
a  writer  method  put-x  is  defined  that,  when  called  on  an  instance  object  o  of  C  and  a  new 
value  V  of  x,  and  assigns  v  to  x  in  o  and  returns  o.  Furthermore,  a  cwriter  method  cput-x  is 
defined  that  behaves  just  like  the  writer  method  put-x  except  that  it  is  not  strict — it  does 
not  necessarily  touch  its  second  argument  v.  Finally,  acas-er  method  cap-x  is  defined  that 
performs  an  atomic  compare-and-put  operation;  (cap-x  o  comparison  replacement)  checks 
whether  the  value  of  instance  variable  x  in  O  is  eq  to  the  value  of  comparison.  If  so,  it  stores 
the  value  of  replacement  in  x  and  returns  true;  otherwise,  it  returns  false. 

If  it  is  desirable  to  produce  reader,  writer,  cwriter,  or  cas-er  methods  with  names  different 
from  the  defaults,  the  Sreader,  swriter,  scwriter,  and  &cas-er  options  can  be  used  to 
specify  the  new  names.  More  than  one  method  name  may  be  specified  for  an  instance  vari¬ 
able.  If  sreader,  Swriter,  &cwriter,or  &cas-er  is  used,  the  corresponding  default 
method  name  is  not  defined.  For  example,  if  swriter  is  used  with  an  empty  list  of  names, 
the  corresponding  writer  method  name  is  not  defined. 

Class  Definition  Options 


class-declaration  =  Sinline  |  &not-inline-de£ault  | 
filmmutable  | 
spredicate  names 

A  class  definition  allows  several  options  which  are  described  in  more  detail  below. 

A  class  may  be  declared  inline,  which  means  that,  whenever  possible,  objects  of  that  class  are 
allocated  inside  other  objects  or  in  local  variables  instead  of  on  the  heap.  &not-inline-de- 
f  ault  is  an  option  for  inline  classes. 

Objects  of  an  immutable  class  declared  with  the  &  immutable  option  may  be  shallow-copied 
at  any  time  at  the  system’s  discretion,  which  can  lead  to  significanc  performance  improve¬ 
ments.  They  are  also  often  passed  by  value  to  methods  and  functions.  It  is  not  necessary 
that  no  methods  ever  write  to  instance  variables,  but  only  that  the  effects  of  such  writes  not 
be  visible  outside  the  class  data  abstraction.  The  compiler  is  free  to  ignore  s immutable  dec¬ 
larations. 

The  spredicate  option  defines  the  name  or  names  of  the  class  predicate.  A  class  predicate 
is  a  function  that  returns  true  when  called  on  an  object  of  the  specified  class  or  its  sub¬ 
classes  and  false  on  all  other  objects.  The  default  name  of  a  class  predicate  is  obtained  by 
concatenating  a  question  mark  (?)  to  the  end  of  the  class  name,  so  (integer?  x)  tests 
whether  x  is  an  integer. 
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Inline  Classes 

When  a  class  is  declared  &  inline,  instance  objects  of  that  class  are  often  Mined — allocated 
inside  other  objects  or  local  variables.  No  method  dispatching  takes  place  on  inlined  objects 
because  the  compiler  knows  the  exact  types  of  inlined  objects — inline  class  methods  are  con¬ 
verted  to  functions.  Declaring  a  class  &inline  does  not  alter  its  semantics  except  for  a  few 
additional  restrictions  on  its  usage.  The  compiler  is  free  to  ignore  &inline  declarations. 

Subclasses  of  inline  classes  can  be  declared  under  the  following  restrictions:  A  subclass  of  an 
inline  class  may  not  declare  any  additional  instance  variables,  and  it  may  not  override  any 
methods.  The  only  superclasses  allowed  for  inline  classes  are  classes  with  no  instance  vari¬ 
ables. 

Normally  all  formals,  locals,  and  instance  variables  declared  with  inline  classes  are  inlined 
by  default.  However,  that  default  can  be  overridden  for  individual  variables  by  declaring 
them  &not-inline^  The  default  can  be  overridden  for  all  variables  by  declaring  the  class 
&not-inline-def  ault,  in  which  case  individual  variables  can  be  inlined  by  declaring  them 
fiinline  and  giving  them  the  proper  type. 

Inline  classes  are  useful  for  representing  small  objects  such  as  floats  and  locks  which  re¬ 
quire  more  than  one  word  but  for  which  ordinary  object  overhead  is  prohibitive.  In  general,  it 
is  pointless  to  declare  a  class  inline  unless  it  is  immutable  or  its  instance  objects  are  rarely 
passed  to  methods  other  than  the  inline  class’s. 


’Another  way  to  override  this  default  is  to  declare  the  variable’s  type  ns  object. 
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A.5.  Methods  and  Functions 


Introduction 

Methods  and  functions  are  the  basic  blocks  of  computation  in  Concurrent  Smalltalk.  Each 
method  and  function  can  accept  a  number  of  arguments,  which  are  assigned  to  the  formals 
for  the  duration  of  the  execution  of  the  body  of  the  method  or  function.  Furthermore,  a 
method  has  a  special  first  argument,  called  the  receiver,  which  contains  an  object  of  the 
method’s  class  on  which  the  method  was  called.  In  general,  methods  and  functions  execute 
concurrently  unless  explicitly  synchronized.  This  is  true  even  if  they  are  accessing  shared 
objects. 

Formals 

formal-spec  :;=  typed-opt-names  |  (typed-opt-names  {formal-declaration}) 
typed-opt-names  ::=  opt-name  (,  opl-name)*  [:  type} 
opt-name  ::=  name  |  _ 

formal-declaration  =  avalue  |  Cinllne  |  £not -inline  I  &no-leak  |  finame  name 

A  method’s  or  function’s  formals  are  listed  when  the  method  or  function  is  declared.  Each 
name  specifies  the  name  of  a  formal.  Typed-opt-names  specifies  one  or  more  names  sepa¬ 
rated  by  commas  followed  by  an  optional  type.  The  character  _  can  be  used  to  indicate  an 
unnamed  formal;  unnamed  formals  accept  arguments  but  cannot  be  referenced  from  within 
the  method  or  function.  If  type  is  not  present,  it  defaults  to  object.  If  the  long  form  of  a 
formal-spec  is  used,  the  formals  in  typed-opt-names  can  be  declared  using  declarations. 

Arguments  are  passed  by  value,  just  as  in  Smalltalk-80,  Scheme,  and  Common  Lisp.  The 
types  of  the  arg^uments  to  the  method  or  function  must  be  subtypes  of  the  types  of  the  corre¬ 
sponding  formals.  A  method  or  a  function  may  assign  a  value  to  a  formal,  which  only 
changes  the  method’s  or  function’s  local  value.  Of  course,  a  method  or  a  function  is  also  free 
to  mutate  a  formal  using  some  other  method;  such  changes  are  visible  to  the  outside.  This 
kind  of  mutation  corresponds  to  communication  via  shared  objects. 

A  formal  may  be  declared  &value,  which  means  that,  at  the  implementation’s  discretion,  the 
method  or  function  may  be  passed  a  shallow  copy  of  the  argument  when  it  is  called.  Thus, 
not  only  is  the  formal  passed  by  value,  but  its  first-level  structure  may  also  be  passed  by 
value.  All  formals  declared  using  an  iimmutable  class  are  automatically  declared  fivalue. 
svalue  declarations  are  especially  useful  to  improve  efficiency  of  inline  classes, 

A  formal  may  also  be  declared  iinline  or  4not -inline.  These  are  hints  to  the  compiler 
that  the  formal’s  object  should  be  placed  inline  or  on  the  heap.  These  hints  only  apply  if  the 
formal’s  type  is  an  inline  class.  The  compiler  is  free  to  ignore  these  declarations. 

Declaring  a  formal  &no-leak  is  a  hint  to  the  compiler  that  the  value  of  the  formal  is  not 
passed  out  of  the  method  or  function,  and  it  will  not  be  referenced  after  the  method  or  func¬ 
tion  returns.  Thus,  the  implementation  is  free  to  perform  a  shallow  deallocate  on  the  value 
of  the  formal  when  the  function  returns.  This  declaration  is  especially  helpful  for  arguments 
of  type  f  unct.  The  compiler  is  free  to  ignore  this  declaration. 

&name  can  be  used  to  name  an  anonymous  function  or  method.  The  name  is  saved  for  de¬ 
bugging  purposes,  sname  is  only  allowed  in  a  lambda  or  a  met  hod- lambda. 
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Return  Values 

return-specs  :;=  :  type  |  : :  (return-spec*) 

return-spec  ::=  typed-names  (  (typed-names  {return-declaration}) 

typed-names  ::=  name  (,  name)*  [:  type] 

return-declaration  s  svaiue 

A  method’s  or  function’s  return  specification  may  be  listed  when  the  method  or  function  is 
declared.  Most  methods  and  functions  return  only  one  value.  For  these  functions,  the  short 
form,  consisting  of  a  colon  ( : )  followed  by  the  return  type,  is  adequate.  If  the  return  type  is 
object,  the  entire  return  specification  can  be  omitted  altogether. 

’The  long  form  of  declaring  a  method’s  or  function’s  return  types  uses  the  double  colon  ( :  : ) 
notation  and  allows  explicit  naming  of  the  return  continuation.  The  name  is  called  a  contin¬ 
uation  name.  Continuation  names  are  lexically  scoped  and  may  be  referenced  in  the  body  of 
the  method  or  function.  The  syntax  and  semantics  of  continuation  declarations  are  analo¬ 
gous  to  those  of  formals,  and  the  continuation  names  reside  in  the  same  namespace  as  formal 
and  variable  names.  The  only  declaration  allowed  is  fivalue.  If  the  short  form  is  used,  a  de¬ 
fault  continuation  name  continuation  is  used.  Some  implementations  may  also  allow  re¬ 
turning  multiple  values.  Multiple  values  do  not  all  have  to  be  returned  at  the  same  time,  but 
all  have  to  be  returned  at  most  once  by  the  time  the  method  or  function  finishes. 

Since  the  implicit  return  statement  at  the  end  of  a  method’s  or  function’s  body  returns  its 
value  to  continuation,  it  is  an  error  to  allow  execution  to  “fall  through”  the  method  or 
function  to  the  implicit  return  statement  unless  one  of  the  continuations  is  named  contin¬ 
uation. 

Method  and  Function  Declarations 

funct-declaration  =  &non-atrict  | 

(&inline  |  &not-inline)  | 

&sicle-e££ect-£ree 

The  following  declarations  are  allowed  for  methods  and  functions: 

•  The  &non-strict  declaration  specifies  that  the  arguments  do  not  have  to  be  touched  be¬ 
fore  the  body  of  the  method  or  function  begins  executing.  Thus,  the  method  or  function  may 
at  the  compiler’s  discretion  receive  cfutures  in  the  formals.  This  declaration  is  useful  mainly 
for  inline  functions. 

•  The  &  inline  and  &not-inline  declarations  specify  that  the  method  or  function  should 
or  should  not  be  included  inline  at  the  points  where  it  is  called.  This  declaration  is  only  a 
hint,  and  the  compiler  does  not  have  to  obey  it. 

•  The  &side-ef  fect-f  ree  declaration  is  a  hint  to  the  compiler  that  the  method  or  func¬ 
tion  does  not  perform  any  visible  side  effects  on  its  arguments  or  on  the  global  environment. 
This  information  lets  the  compiler  better  schedule  calls  to  the  method  or  function.  This  di¬ 
rective  is  also  useful  on  methods  and  functions  that  dfi  perform  side  effects;  it  tells  the  com¬ 
piler  that  those  side  effects  are  not  essential.  One  example  of  a  method  that  falls  into  this 
category  is  a  method  operating  on  an  immutable  class  of  complex  numbers  that  allows  redun¬ 
dant  representations  in  rectangular  or  polar  form.  The  method  could  side  effect  a  complex 
number  to  calculate  its  polar  representation  from  its  rectangular  one,  but  that  side  effect  is 
not  essential  for  the  program  to  work  correctly. 

The  Calling  Process 

When  a  function  or  a  method  is  called,  the  values  of  the  arguments  are  computed  and  as¬ 
signed  to  the  formals.  The  formals  are  touched  unless  the  function  is  declared 
&  non- strict.  After  all  formal  values  are  evaluated,  execution  of  the  method’s  expressions 
proceeds  as  if  the  expressions  were  enclosed  in  an  implicit  block — initially  the  first  expres¬ 
sion  is  evaluated,  then  the  second  one,  and  so  forth.  The  value  of  the  implicit  block,  which  is 
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the  value  of  the  last  expression,  is  returned  to  the  caller  unless  an  exit  or  return  state¬ 
ment  is  encountered  first. 

Scoping  of  Local  Variables 

Local  variables  are  statically  scoped.  Any  lambda,  method-lambda,  future,  or  lazy-fu¬ 
ture  created  within  a  method  or  a  function  is  a  full  closure  and  may  reference  and  alter  the 
method’s  or  function’s  local  variables.  Similarly,  the  method  or  function  may  alter  its  locals, 
and  such  changes  will  be  visible  to  any  lambda,  method-lambda,  future,  or  lazy-future 
nested  within  it. 

If  concurrency  and  efficiency  are  desired,  however,  such  sharing  should  be  avoided  whenever 
possible.  A  lambda,  method-lambda,  future,  or  lazy-future  should  declare  its  own  tem¬ 
poraries  for  local  computations  instead  of  using  ones  belonging  to  an  outer  static  scope.  If  a 
method  or  function  wants  to  pass  values  into  a  closure,  it  should  initialize  the  appropriate 
temporaries  before  the  closure  is  created  and  not  change  those  temporaries  afterwards.  The 
closure  should  not  change  those  temporaries  either,  unless  it  wants  to  pass  a  result  back  to 
the  method  or  function  that  created  it. 

Functions 

(lambda  (formai-spec*)  [return-specs]  [funct-declaralion]  Primitive 

body) 

Lambda  defines  and  returns  an  anonymous  function.  Formal-spec*  is  a  list  of  the  function’s 
formals  and  their  types.  Return-specs  specifies  the  function’s  return  type,  or,  if  it  returns 
multiple  values,  the  number  of  return  values  and  their  types.  The  function  may  also  have 
declarations,  as  explained  above.  Body  is  a  list  of  statements  that  form  the  body  of  the  func¬ 
tion. 

(defun  name  (formal-spec*)  [return-specs]  [funct-declaration]  Top-level  Macro 

body) 

Defun  defines  a  global  function  with  name  name,  formals  as  specified  in  formal-spec*,  return 
values  defined  by  return-specs,  optional  declarations  fund-declaration,  and  body  body. 

Methods 

(method-lambda  Class  (formal-spec*)  [return-specs]  [funct-declaration]  Macro 

body) 

Method-lambda  returns  a  method  of  class  class.  The  resulting  method  does  not  have  a  se¬ 
lector.  Nevertheless,  it  can  be  called  as  a  function  if  the  first  argument  is  an  instance  object 
of  class.  The  other  parameters  are  as  in  lambda. 

Method-lambda  also  introduces  into  the  scope  of  body  the  names  of  the  instance  variables  of 
an  object  of  class  class  as  well  as  two  special  variables;  self  and  group.  Self  refers  to  the 
first  argument  of  the  method  call,  also  known  as  the  receiver  object.  If  class  is  a  subclass  of 
distob  j,  group  refers  to  the  group  name  of  the  distributed  object  of  which  self  is  a  con¬ 
stituent. 

(defselector  selector)  Top-level  Primitive 

selector  ::=  name 

Defselector  defines  name  as  a  selector.  This  primitive  is  rarely  used  explicitly,  as  all  un¬ 
defined  names  are  assumed  to  be  selectors  by  default. 
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(add-method  selector  class  value)  Top-level  Priaiitive 

Add-method  associates  a  method  with  its  class  and  selector.  When  selector  is  called  with  a 
receiver  object  that  belongs  to  class,  value  is  called.  Value  should  be  a  function  or  a  method. 

(method  selector  class)  Primitive 

Method  performs  the  inverse  of  the  add-method  operation — it  returns  the  method  associ¬ 
ated  with  selector  and  class.  If  no  method  is  associated  with  selector  and  class,  method  re¬ 
turns  nil. 

(defmethod  selector  class  (formal-spec*)  [return-specs] 

{funct-declaration} 
body) 

Defmethod  defines  a  global  method  with  the  given  selector  and  class, 
is  analogous  to  defun. 

When  a  method  is  called,  the  values  of  the  selector  and  arguments  are  computed,  and  the 
method  associated  with  the  selector  and  the  class  of  the  receiver  object  is  found.  Of  course, 
this  method  may  have  been  defined  for  a  superclass  of  the  class  of  the  receiver  object  (i.e.  it 
may  be  inherited).  It  is  an  error  occurs  if  no  such  method  exists.  Otherwise,  the  process  of 
calling  a  method  is  the  same  as  that  of  calling  a  function. 


Top-level  Macro 

The  rest  of  the  syntax 
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A.6.  Statements 

value  ::=  statement 
expression  ::=  statement 
body  ;>  statement* 

In  the  definitions  below,  the  non-terminals  value,  expression,  and  statement  all  refer  to 
statements,  although  value  usually  denotes  a  side-eifect-free  statement  that  is  executed  for 
its  return  value,  expression  denotes  a  statement  that  may  have  side  effects  but  is  executed 
mainly  for  its  return  value,  and  statement  denotes  a  statement  that  is  executed  mainly  for  its 
side  effects.  A  body  is  a  sequence  of  statements  executed  one  after  another  just  like  in  be¬ 
gin;  the  value  of  a  body  is  the  value  of  its  last  statement. 

Futures  and  CFutures 

Futures  and  efutures  (context  futures)  are  the  main  means  of  achieving  concurrency  in  Con¬ 
current  Smalltalk.  Both  futures  and  efutures  are  promises  to  produce  some  value  at  a  later 
time.  Forcing  a  future  means  forcing  the  future  to  fulfill  its  promise  and  return  its  value. 
Analogously,  touching  a  efuture  forces  it  to  calculate  and  return  its  value.  A  force  implies  a 
touch,  so  a  force  never  returns  a  efuture. 

There  are  two  main  differences  between  futures  and  efutures.  These  are  outlined  below: 

•  Futures  are  guaranteed  not  to  be  forced  unless  they  are  explicitly  forced,  while  efutures 
are  not  guaranteed  not  to  be  touched — they  may  be  touched  at  any  time  at  the  compiler’s  and 
operating  system’s  discretion.  In  an  extreme  case,  efutures  may  be  touched  as  soon  as  they 
are  created,  leading  to  a  sequential  implementation  of  Concurrent  Smalltalk  (except  for  fu¬ 
tures). 

•  CFutures  are  generated  by  almost  all  primitive  operations,  while  futures  are  generated 
only  by  the  future  and  lazy-future  primitives  and  their  derivatives. 

•  CFutures  are  always  eager — if  left  alone,  they  will  tend  to  evaluate  to  their  values.  Nor¬ 
mal  futures,  on  the  other  hand,  may  be  eager  or  lazy.  A  lazy  future  may  not  begin  to  evalu¬ 
ate  to  its  value  until  it  is  forced;  if  it  is  never  forced,  it  may  never  be  evaluated. 

The  rationale  behind  creating  two  kinds  of  futures  is  to  allow  the  use  of  efutures  for  most 
tasks  where  parallelism  is  desirable  but  guaranteed  parallelism  is  not  necessary  for  the  cor¬ 
rect  operation  of  the  program.  CFutures  are  intended  to  be  very  cheap — ^they  can  be  created 
and  touched  in  a  few  assembly  language  instructions.  Futures,  on  the  other  hand,  are  re¬ 
served  for  the  cases  like  normal-order  evaluation  where  the  semantics  of  delayed  evaluation 
are  necessary  for  the  program  to  run  correctly.  Futures  are  much  more  expensive  than  efu¬ 
tures  in  terms  of  space  and  time. 

Both  futures  and  efutures  may  have  values  of  complicated  expressions  as  their  promises.  For 
example,  if  (f  3)=30,  (g  7)=49,and  (h  30  4  9)  =7  9,  during  the  execution  of  the  state¬ 
ment 

(cset  a  (h  (f  3)  (g  7))) 

a  may  be  computed  in  arbitrary  order,  and  f  and  g  need  not  have  returned  values  by  the 
time  the  next  statement  is  executed.  If  a  is  later  touched,  it  will  assume  the  value  7  9. 

The  semantics  become  more  complicated  if  the  functions  f ,  g,  and  h  have  side  effects.  The 
order  of  evaluation  of  arguments  of  function  calls  is  undefined  and  may  be  parallel,  so  f  and 
g  may  be  evaluated  in  parallel.  Furthermore,  if  h  is  declared  &non-strict  (as  many  built- 
ins  are),  the  evaluation  of  h  may  overlap  with  the  evaluation  of  its  arguments.  If,  say,  h  does 
not  use  the  value  of  its  second  argument  until  late  in  its  execution,  h  may  already  be  execut¬ 
ing  while  (g  7)  is  still  being  calculated.  Finally,  if  h  can  return  without  ever  requesting  the 
value  of  its  second  argument,  (g  7)  may  never  be  completely  evaluated  (since  efutures  are 
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eager,  it  will  keep  evaluating,  but  the  entire  program  may  finish  before  it  is  done).  A  good 
example  of  this  phenomenon  is  (and  b  false) ,  where  the  program  can  proceed  without 
ever  determining  the  value  of  b. 

Argument  Evaluation 

Unless  a  method  or  a  function  is  declared  &non-strict,  method  and  function  calls  are  strict 
with  respect  to  cfutures  but  not  futures — the  arguments  of  a  method  or  function  are  guaran¬ 
teed  not  to  be  cfutures  when  the  method  or  function  begins  evaluation.  For  example,  assum¬ 
ing  no  futures  are  used,  in 

(cset  a  (h  (f  3)  (g  7)  ) ) 

(cset  b  (k  10) ) 

(touch  a) 

(cset  c  (1  10) ) 

(f  3)  and  (g  7)  are  guaranteed  to  be  done  evaluating  before  (h  30  49)  begins  evaluating.  Also, 
(f  3),  (g  7),  and  (h  30  49)  are  guaranteed  to  be  done  evaluating  before  the  evaluation  of  (1 10) 
is  started.  However,  (k  10)  can  be  evaluated  concurrently  with  any  of  (f  3),  (g  7),  (h  30  49),  or 
(1  10). 

The  arguments  of  functions  are  evaluated  concurrently.  This  means  they  may  be  evaluated 
sequentially,  in  parallel,  or  any  combination  of  the  two.  Using  side  effects  can  sometimes 
lead  to  deadlock.  For  example,  suppose  that  the  function  release-lock  releases  a  global 
lock  and  acquire-lock  waits  until  the  lock  is  released  and  then  acquires  it.  Further,  sup¬ 
pose  that  global-lock  is  originally  acquired.  Then,  the  expression 
(h  (release-lock  global-lock)  (acquire-lock  global-lock)) 
can  lead  to  deadlock  because  the  implementation  might  choose  to  evaluate  acquire-lock 
sequentially  before  release-lock. 

Concurrent  evaluation  order  is  also  distinct  from  an  arbitrary  sequential  order.  For  example, 
suppose  that  c  is  a  local  variable  with  an  initial  value  of  0  and  consider  the  value  of  the  ex¬ 
pression 

(cset  a  (+  (cset  c  (+  1  c) )  (cset  c  (+  1  c)))) 

(touch  a) 

(touch  c) 

Under  sequential  evaluation  of  arguments,  the  final  value  of  a  would  always  be  3  and  the  fi¬ 
nal  value  of  c  would  always  be  2  when  this  expresoion  completes.  Under  concurrent  evalua¬ 
tion  of  arguments,  the  final  value  of  c  could  be  1  if,  say,  both  increments  were  done  before  ei¬ 
ther  assignment  to  c.  In  this  case,  a  would  get  the  value  2. 

(touch  expression)  Primitive 

(touch  expression*)  Macro 

If  expression  is  not  a  cfuture,  touch  does  nothing.  Otherwise,  touch  waits  until  the  value  of 
the  cfuture  is  available  and  then  returns  that  value.  It  should  be  kept  in  mind  that  if  touch 
is  used  in  a  subexpression,  other  subexpressions  may  or  may  not  continue  evaluating  while 
this  touch  is  waiting.  Also,  a  touch  in  a  subexpression  does  not  guarantee  that  the  entire 
expression  will  not  yield  a  cfuture,  as  is  demonstrated  in  one  of  the  examples  above. 

If  more  than  one  expression  is  specified,  touch  touches  them  all  and  returns  the  value  of  the 
last  one.  If  no  expressions  are  specified,  touch  returns  nil. 

Touch  does  not  have  any  effect  on  futures. 
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Primitive 
Macro 
Macro 

If  expression  is  not  a  future  or  a  cfuture,  force  does  nothing.  Otherwise,  force  waits  until 
the  value  of  the  future  or  cfuture  is  available  and  then  returns  that  value.  That  value  is 
guaranteed  not  to  be  a  future  or  a  cfuture. 

If  more  than  one  expression  is  specified,  force  forces  them  all  and  returns  the  value  of  the 
last  one.  If  no  expressions  are  specified,  force  returns  nil.  The  '.expression  form  is  a 
shorthand  for  (force  expression). 

(future  expression)  Primitive 

(lazy-future  expression)  Primitive 

Future  and  lazy-future  both  return  futures  that  promise  to  evaluate  expression  when 
forced.  The  futures  are  guaranteed  to  evaluate  in  parallel  with  all  other  processes  unless 
explicitly  synchronized.  Future  and  lazy-future  differ  in  that  future  begins  evaluating 
its  expression  immediately,  while  lazy-future  waits  until  it  is  forced  before  it  starts 
evaluating  its  expression  In  any  case,  expression  is  evaluated  at  most  once,  no  matter  how 
many  times  it  is  forced. 

Caveats:  The  actual  time  when  a  future  is  forced  is  sometimes  rather  fuzzy,  especially  in 
the  presence  of  inlined  primitives  and  side-eflfect-free  functions,  so  the  guarantee  in  the  pre¬ 
vious  paragraph  may  not  apply  in  the  code  just  before  a  future  is  forced  (the  extent  of  this 
fuzzy  section  of  code  is  still  to  be  determined).  Also,  futures  should  not  return  objects  of 
classes  that  can  be  inlined — doing  this  may  force  the  future  immediately  at  any  point.  These 
caveats  should  not  present  problems  unless  futures  have  intricate  side  effect  dependencies. 

Application  Statement 

(fund  arg*)  Primitive 

fund  expression 
arg  :;=  expression 

The  first  item  of  an  application  statement  is  either  a  method  selector  or  a  function.  If  it  is  a 
selector,  the  method  corresponding  to  the  selector  and  the  class  of  the  first  argument  is  called 
using  the  arguments  provided.  If  it  is  a  function,  it  is  applied  to  the  specified  arguments. 
The  first  item  can  also  be  any  expression  that  evaluates  to  an  object  of  type  funct.  The 
value  of  the  application  statement  is  either  the  return  value  or  a  cfuture  promising  that 
value. 

The  order  of  evaluation  of  arguments  is  not  specified;  in  fact,  some  of  them  may  be  (but  are 
not  guaranteed  to  be)  evaluated  concurrently.  The  arguments  are  not  guaranteed  to  be 
touched  before  being  passed  to  the  funct — some  of  them  may  be  passed  to  the  funct  as  futures 
or  even  cfutures  (However,  all  user-defined  methods  and  functions  not  explicitly  declared 
:non-strict  will  touch  their  arguments  before  their  code  begins  executing).  For  example, 
(cset  a  (+  0  a))  does  not  touch  a,  and  (and  b  false)  does  not  touch  b. 

Type  Assertion 

(:lype  expression)  Primitive 

The  type  assertion  statement  asserts  that  the  type  of  expression’s  value  is  a  subtype  of  type. 
It  returns  expression’s  value.  The  compiler  is  not  required  to  generate  an  error  if  expression 
evaluates  to  a  value  that  is  not  a  subtype  of  type,  but  it  may  do  so. 


(force  expression) 

!  expression 
(force  expression*) 
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Variable  Bindings 

(clet  (binding-spec*)  body)  Primitiva 

binding-spec  ::=  typed-opt-names  |  (typed-opt-names  {variable-declaration)  [value]) 
typed-opt-names  ::=  opt-name  (,  opl-name)*  [;  type] 
opt-name  name  |  _ 

variable-declaration  s  tinline  |  finot-lniine 

Clet  creates  local  variable  bindings  and  evaluates  body  within  the  scopes  of  those  bindings. 
Each  name  specifies  the  name  of  a  new  variable.  Typed-opt-names  specifies  one  or  more 
names  separated  by  commas  followed  by  an  optional  type.  The  character  _  can  be  used  to  in¬ 
dicate  an  unnamed  local  variable;  unnamed  local  variables  can  be  used  to  evaluate  the  initial 
value  expression  without  binding  a  name  in  the  static  scope.  If  type  is  not  present,  it  de¬ 
faults  to  object.  If  the  long  form  of  a  binding-'.pec  is  used,  the  variables  in  typed-opt-names 
can  be  declared  using  declarations  and  can  be  given  an  initial  value.  Value,  the  initial  value 
is  an  expression  evaluated  outside  the  scope  of  the  clet.  Each  initial  value  is  evaluated  only 
once,  even  if  it  is  assigned  to  more  than  one  variable.  The  new  variables  are  bound  concur¬ 
rently.  Their  initial  values  may  be  evaluated  concurrently,  and  they  are  not  guaranteed  to 
be  touched  by  the  time  body  begins  executing — in  body  the  new  variables  may  still  contain 
cfuturea. 

A  variable  may  be  declared  &inline  or  &not-inline.  These  are  hints  to  the  compiler  that 
the  variable’s  object  should  be  placed  inline  or  on  the  heap.  These  hints  only  apply  if  the 
variable’s  type  is  an  inline  class.  The  compiler  is  free  to  ignore  these  declarations. 

The  value  returned  by  a  clet  is  the  value  returned  by  the  last  statement  in  body. 

(let  (binding-spec*)  body)  Macro 

Let  is  the  same  as  clet  except  that  all  newly-bound  variables  are  touched  before  body  be¬ 
gins  executing.  As  with  clet,  the  initial  values  are  evaluated  concurrently. 

(cset  name  expression)  Primitive 

Cset  sets  the  variable  name  to  expression.  The  variable  gets  either  the  touched  value  of  ex¬ 
pression  or  a  cfuture  promising  to  evaluate  expression.  The  value  returned  by  a  cset  is  the 
value  of  expression. 

(set  name  expression)  Macro 

Set  sets  the  variable  name  to  the  value  of  expression.  The  value  is  touched  before  it  is  as¬ 
signed  to  the  variable,  so  the  variable  will  not  contain  a  cfuture  or  a  future  after  this  state¬ 
ment.  The  value  returned  by  a  set  is  the  touched  value  of  expression. 

(cas  name  comparison  replacement)  Primitive 

comparison  ::=  expression 
replacement  ::=  expression 

CAS  (compare-and-set)  is  an  atomic*  operation  that  checks  whether  the  value  of  variable 
name  is  eq  to  the  value  of  comparison.  If  so,  the  value  of  replacement  is  stored  in  variable 
name  and  cas  returns  true;  otherwise,  cas  returns  false.  The  value  of  variable  name  is 
never  a  cfuture  when  cas  completes. 


*In  the  current  implementation,  in  order  for  cas  to  be  atomic,  neither  name  nor  replacement  can  be  a  future.  If 
replacement  could  be  a  future,  it  should  be  forced  before  a  cas  is  done.  There  is  no  easy  solution  if  name  could  be  a 
future.  Fortunately,  there  is  usually  little  reason  to  store  a  future  in  a  semaphore. 
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Multiple  Values 

The  constructs  below  are  used  for  receiving  multiple  values  from  methods  and  functions. 
Multiple  values  may  not  be  supported  by  all  implementations  of  Concurrent  Smalltalk. 

(mv-cset  (name*)  (fund  arg*);  optional  Primitive 

MV-cset  sets  the  variables  name*  to  the  multiple  values  returned  by  (fund  arg*) .  The  vari¬ 
ables  get  either  the  touched  return  values  of  expression  or  cfutures  promising  to  evaluate 
them.  Some  of  the  return  values  may  be  available  before  others.  Mv-cset  returns  nil. 

(mv-set  (name*)  (fund  arg*))  Optional  Macro 

MV-set  is  just  like  mv-cset  except  that  it  touches  all  variables  in  name*  before  continuing. 

(mv-ciet  (mv-binding-spec*)  (fund  arg*)  body)  Optional  Macro 

mv-binding-spec  ;;=  typed-opt-names  1  (typed-opt-names  {variable-declaration)) 
typed-opt-names  opt-name  {,  opt-name)*  [-.  type] 
opt-name  ;:=  name  |  _ 

variable-declaration  s  tinline  1  anot-iniine 

(mv-let  (mv-binding-spec*)  (funct  arg*)  body)  optional  Macro 

MV-clet  and  mv-let  are  just  like  clet  and  let  except  that  they  initialize  the  new  vari¬ 
ables  to  the  values  returned  by  (funct  arg* ) . 

Syntactic  Sugar 

[arg*]  Macro 

This  form  is  equivalent  to  (get  arg*) . 

(cset  (fund  arg*)  expression)  Macro 

When  the  first  argument  of  a  cset  is  a  function  or  a  method  call,  cset  is  desugared  into  an¬ 
other  function  or  a  method  call.  The  above  forms  are  converted  to  (fund'  arg*  expression), 
where  the  identifier  fund'  is  obtained  by  appending  the  characters  cput-  to  the  beginning  of 
the  identifier  fund,  unless: 

fund  is  get,  in  which  case  fund'  is  cput; 

fund  is  get-x,  in  which  case  fund'  is  cput-x  (x  is  any  sequence  of  characters); 
fund  is  put,  put-x,  cput,  cput-x,  cap,  or  cap-x,  in  which  case  an  error  occurs. 

Fund  must  be  a  function  name  or  a  method  selector.  It  may  not  be  an  expression  or  a  vari¬ 
able  reference.  (fund  arg*)  may,  however,  be  a  macro  or  contain  macros;  these  macros  are 
expanded  before  the  above  conversion  takes  place. 

For  example,  (cset  (first  sequence)  3)  is  converted  to  (cput-first  sequence  3), 
while  (cset  [big-array  7]  12)  is  converted  to  (cput  big-array  7  i2). 

(set  (fund  arg*)  expression)  Macro 

When  the  first  argument  of  a  set  is  a  function  or  a  method  call,  set  is  desugared  into  an¬ 
other  function  or  a  method  call.  The  above  forms  are  converted  to  (fund'  arg*  expression) , 
where  the  identifier  fund'  is  obtained  by  appending  the  characters  put-  to  the  beginning  of 
the  identifier  k'nd,  unless: 

fund  is  get,  in  which  case  fund'  is  put; 

fund  is  get-x,  in  which  case  fund'  is  put-x  (x  is  any  sequence  of  characters); 
fund  is  put,  put-x,  cput,  cput-x,  cap,  or  cap-x,  in  which  case  an  error  occurs. 

Fund  must  be  a  function  name  or  a  method  selector.  It  may  not  be  an  expression  or  a  vari¬ 
able  reference,  (fund  arg*)  may,  however,  be  a  macro  or  contain  macros;  these  macros  are 
expanded  before  the  above  conversion  takes  place. 
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For  example,  (set  (first  sequence)  3)  is  converted  to  (put-first  sequence  3), 
while  (set  [big-array  7)  12)  is  converted  to  (put  big-array  7  12). 

(cas  (funct  arg*)  comparison  replacement)  Macro 

When  the  first  argument  of  a  cas  is  a  function  or  a  method  call,  cas  is  desugared  into  an¬ 
other  function  or  a  method  call.  The  above  form  is  converted  to 
(fund'  arg*  comparison  replacement) ,  where  the  identifier  funct’  is  obtained  by  appending 
the  characters  cap-  (compare-and-put)  to  the  beginning  of  the  identifier  funct,  unless: 
funct  is  get,  in  which  case  funct'  is  cap; 

fund  is  get-x,  in  which  case  funct'  is  cap  ..  (x  is  any  sequence  of  characters); 
funct  is  put,  put-x,  eput,  eput-x,  cap,  or  cap-x,  in  which  case  an  error  occurs. 

Funct  must  be  a  function  name  or  a  method  selector.  It  may  not  be  an  expression  or  a  vari¬ 
able  reference.  (funct  arg*)  may,  however,  be  a  macro  or  contain  macros;  these  macros  are 
expanded  before  the  above  conversion  takes  place. 

Flow  of  Control 


(begin  body)  Primitive 

Begin  evaluates  the  statements  in  body  sequentially,  touching  each  one  except  the  last  be¬ 
fore  it  begins  the  next,  and  returns  the  untouched  value  returned  by  the  last  one.  If  there 
are  no  statements  in  body,  begin  returns  nil. 

(nconcuxrently  Statement*)  Macro 

(concurrently  Statement*)  Macro 

’Statement  Macro 

These  macros  evaluate  the  statements  in  statement*  concurrently  and  return  nil. 

Concurrently  waits  until  all  statements  have  finished  executing  before  returning,  while 

nconcurrently  does  not.  ?Statement  is  an  abbreviation  for  (nconcurrently  Statement) . 


(nparallel  Statement*)  Macro 

(parallel  Statement*)  Macro 

These  macros  evaluate  the  statements  in  statement*  in  parallel  and  return  nil.  Parallel 
waits  until  all  statements  have  finished  executing  before  returning,  while  nparallel  does 
not.  The  parallelism  is  guaranteed,  which  makes  parallel  a  much  more  expensive  state¬ 
ment  than  concurrently.  In  most  cases  concurrently  should  be  used  instead  unless  par¬ 
allel  semantics  are  explicitly  required. 


(if  test  consequent  [alternative])  Primitive 

test  ::=  expression 
consequent  :;=  expression 
alternative  ::=  expression 

If  evaluates  the  lest  expression,  which  must  return  either  true  or  false.  If  it  returns 
true,  the  consequent  expression  is  evaluated  and  its  value  returned;  otherwise,  the  alterna¬ 
tive  expression,  if  any,  is  evaluated  and  its  value  returned.  If  is  not  guaranteed  to  touch  the 
test  value.  However,  it  is  guaranteed  to  evaluate  onlv  the  appropriate  arm  of  the  condi¬ 
tional. 


Loops 

(while  test  body)  Macro 

test  :;=  expression 

While  evaluates  the  test  expression,  which  must  return  either  true  or  false.  As  long  as  it 
returns  true,  body  is  evaluated  and  test  reevaluated.  When  test  evaluates  to  false,  while 
returns  nil. 
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(repeat  body  until  test)  Macro 

test  "s  expression 

Repeat  first  evaluates  body  and  then  the  test  expression,  which  must  return  either  true  or 
false.  As  long  as  it  returns  false,  repeat  goes  back  to  evaluating  body.  When  test  evalu¬ 
ates  to  true,  repeat  returns  nil. 

Primitive  Control 

(block  continuation  body)  Macro 

Block  is  just  like  begin  except  that  it  allows  the  use  of  return  and  reply  statements  to 
leave  it.  The  statements  in  body  are  evaluated  as  in  a  begin.  Continuation  specifies  the 
block’s  continuation  for  use  in  return  and  reply  statements. 

(loop  continuation  body)  Macro 

Loop  defines  a  loop  body.  The  statements  in  body  are  evaluated  as  in  a  begin,  except  that 
eifter  the  last  statement  in  body  has  been  evaluated,  the  first  statement  is  evaluated  again, 
and  so  on.  The  loop  does  not  terminate  unless  an  explicit  return  or  reply  statement  is  en¬ 
countered.  Continuation  specifies  the  loop’s  continuation  for  use  in  return  and  reply  state¬ 
ments. 

Returning  Values 

Since  the  last  expression  in  the  method  code  is  implicitly  returned  to  continuation,  the 
statements  below  are  necessary  only  if  it  is  desired  to  return  a  value  from  the  middle  of  a 
method  or  function,  if  a  block  or  loop  should  be  terminated,  if  multiple  values  are  being  re¬ 
turned,  or  if  a  value  is  returned  to  a  continuation  with  a  name  other  than  continuation. 
Reply  and  exit  should  be  used  with  caution,  as  exit  may  cause  the  caller  to  hang,  while 
reply  may  cause  the  caller  to  crash  if  two  replies  are  inadvertently  sent.  Care  must  be  taken 
to  reply  to  each  continuation  at  most  once — sending  a  second  reply  to  a  continuation  will  al¬ 
most  certainly  cause  a  system  crash,  and  it  is  quite  difficult  to  protect  the  system  against 
this  type  of  error.  When  using  reply  it  is  important  to  remember  that  there  is  an  implicit 
reply  of  the  last  expression  in  the  method  code  to  continuation. 

Continuations 

Continuations  are  introduced  by  lambda,  method-lambda,  defun,  defmethod,  block, 
loop,  future,  lazy-future,  parallel,  and  nparallel.  The  continuations  defined  by 
future,  lazy-future,  parallel,  and  nparallel  are  not  externally  accessible.  Lambda, 
method-lambda,  defun,  and  defmethod  define  the  default  continuation  continuation  un¬ 
less  told  otherwise.  They  also  reply  to  continuation  if  allowed  to  complete  executing  with¬ 
out  an  intervening  exit.  Thus,  care  must  be  taken  when  using  nested  function  and  method 
definitions  to  make  sure  that  reply  and  return  reply  to  the  right  continuation. 

Continuation  manipulation  can  become  quite  complicated,  and  not  all  features  have  to  be 
supported  by  all  implementations.  A  minimal  implementation  only  has  to  allow  replying  to 
the  innermost  construct  that  defines  continuations;  hence,  an  implementation  may  restrict 
non-local  replies.  Furthermore,  an  implementation  does  not  have  to  support  replying  out  of  a 
future,  lazy-future,  parallel,  or  nparallel  statement,  since  these  also  introduce  con¬ 
tinuations.  A  more  sophisticated  implementation  may  allow  replies  to  all  continuations  ac¬ 
cessible  in  the  current  lexical  scope.  Finally,  an  advanced  implementation  may  choose  to 
make  continuations  first-class  values  of  class  #  :  continuation  and  allow  them  to  be  stored 
in  variables. 

(exit)  Primitive 

Exit  is  a  statement  that  hangs,  never  returning  a  value.  In  most  cases  exit  can  be  thought 
of  as  exiting  the  current  method  or  function,  but  it  does  not  necessarily  do  so  if  used  in  a 
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cset,  concurrently, nconcurrently, parallel, nparallel, block, loop,  future,  or 
lazy-future  statement,  let  or  clet  bindings,  or  some  other  statement  that  permits  paral¬ 
lel  execution  without  synchronization. 

(reply  expression)  Macro 

(reply  {continuation  expression)*)  Prindtive 

The  first  variant  of  reply  evaluates  expression  and  sends  its  value  to  continuation.  Exe¬ 
cution  then  proceeds  with  the  next  statement  of  the  current  method,  if  any.  Reply  is  not 
strict — ^it  may  reply  a  future  or  a  cfuture.  The  value  of  a  reply  statement  is  nil. 

The  second  variant  of  reply  is  used  to  return  values  to  named  continuations.  The  reply 
takes  an  even  nu  nber  of  arguments;  within  each  pair,  the  first  argument  is  the  continuation 
name  and  the  second  one  its  value. 

(return  expression)  Macro 

(return  (continuation  expression)*)  Macro 

Return  is  equivalent  to  a  reply  followed  by  an  exit — the  values  of  the  expressions  are  sent 
to  the  caller,  and  the  execution  of  the  method  or  function  terminates  subject  to  the  caveats  in 
the  exit  statement  description. 

(return-value-expected?)  :bOOlean  Function 

(return -value-expected?  continuation)  iboolean  Function 

Return-value-expected?  returns  true  if  the  caller  of  the  method  or  function  is  expecting 
a  reply  for  continuation  (or  continuation  if  continuation  is  not  specified).  It  is  not  guaraiv 
teed  to  return  false  otherwise,  so  an  implementation  that  always  returns  true  is  accept¬ 
able. 
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A.7.  Built-in  Methods  and  Functions 

Built-in  Classes 

Built-in  classes  are  provided  for  reasons  of  efficiency  and  convenience.  Many  methods  on 
built-in  classes  are  compiled  into  single  assembly  language  instructions  instead  of  method 
calls,  improving  their  speed  greatly.  Other  built-in  classes  may  be  defined  by  methods  writ¬ 
ten  in  assembly  language  and  linked  with  the  programs  generated  by  the  compiler.  Arrays 
may  be  defined  this  way.  The  built-in  classes  are  listed  in  Table  A-2,  and  their  hierarchy  is 

shown  in  Figure  A-2. 

Table  A*2.  Built-in  Classes 

Class 

Metaclass 

Values 

Null^ 

Primitive -Class 

Nil 

Symbol 

Primitive-Class 

Symbols,  including  nil,  but  not  true  and  false 

True^ 

Primitive-Class 

The  boolean  true 

False^ 

Primitive-Class 

The  boolean  false 

Boolean 

Primitive-Class 

The  booleans  true  and  false 

Character 

Primitive-Class 

ASCII  characters 

Small-Integer 

Primitive-Class 

Integers  representable  in  a  machine  word’ 

Large-Integer 

Primitive-Class 

Integers  not  representable  as  Sma  1 1  - 1  nt  ege  rs 

Integer 

Primitive-Class 

Arbitrary-sized  integers 

Float 

Primitive-Class 

Floating-point  numbers^ 

Real 

Standard-Class 

Real  numbers 

Number 

Standard-Class 

Arbitrary  numbers 

Magnitude 

Standard-Class 

Numbers,  characters,  and  booleans 

Primitive -Class 

Primitive -Cl ass 

Primitive  classes  defined  by  Concurrent  Smalltalk 

Standard-Class 

Primitive-Class 

Standard  (non-distributed)  classes 

Distributed-Class 

Primitive-Class 

Distributed  classes 

Class^ 

Primitive -Cl ass 

General  classes 

Function 

Primitive-Class 

Functions,  methods,  and  closures 

Funct 

Primitive -Class 

Functions,  methods,  closures,  and  method  selectors 

System-stream 

Primitive -Cl ass 

System-defined  streams 

Stream 

Standard-Class 

Sources  of  input  or  destinations  for  output 

Simple-Lock 

Primitive-Class 

Very  cheap  and  simple  locks 

Queueing-Lock 

Primitive-Class 

More  expensive  locks  that  queue  pending  tasks 

Lock 

Standard-Class 

General  locks 

Integer-Array 

Primitive-Class 

Small  arrays  of  integers 

String 

Primitive-Class 

Small  arrays  ol  characters 

Boolean-Array 

Primitive-Class 

Small  arrays  of  booleans 

Simple-Array 

Primitive-Class 

Small  arrays  of  arbitrary  objects 

Array 

Standard-Class 

Arrays  of  arbitrary  objects 

Collection 

Standard-Class 

Indexed  collections  ol  objects 

Distob j 

Distributed-Class  All  distributed  objects 

Object 

Standard-Class 

All  first-class  values  | 

i 

The  metaclass  of  a  class  is  the  class  of  the  class  object  itself.  Metaclasses  govern  certain  i 

aspects  of  class  behavior  such  as  inheritance  and  the  action  of  new.  Only  classes  having 

1  standard-class  or  distributed-class 

as  a  metaclass  permit  user-defined  subclasses. 

1  At  the  implementation’s  discretion  some  classes  with  primitive-class  as  a  metaclass  may 

1  ^This  class  name  conflicts  with  another  global  name,  so 

il  has  lo  be  preceded  wilh  »  :  whenever  il  is  used. 

’Currently  a  machine  word  is 

32  bits,  so  the  sr-a ;  i  - 1  n 

cqer  range  is  -2147483648  lo  2147483647. 
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Null 
Boolean • 
Character 

Number - ► 


False 

True 

•Real ' 


'•Integer  < 
‘Float 


'Small-Integer 

‘Large-Integer 


Primitive-Class 
Standard-Class 
Distributed-Class 
Function 
Selector 
System-Stream 
Simple-Lock 
Queuing-Lock 
Array  • 


Integer-Array 
Simple -Array^^^  St  ring 

^Boolean -Array 


Figure  A-2.  Hierarchy  of  built-in  classes 

The  superclasses  are  shown  to  the  left  o(  their  subclasses.  All  classes  are  subclasses  of  object.  Classes 
with  metaclass  primitive-class  are  shown  in  bold,  classes  with  metaclass  standard-class  are  shown  in 
standard  type,  and  the  one  class  with  metaclass  distributed-class  is  shown  in  italic.  User-defined  classes 
may  be  defined  as  subclasses  of  any  of  the  classes  having  standard-class  or  distributed-class  as  a 
me  ta  class 


actually  be  instances  of  standard-class  or  distributed-class,  but  portable  programs 
should  not  rely  on  this. 

Built-in  Methods 

Built-in  methods  are  provided  for  the  basic  arithmetic  and  logical  operations.  The  methods 
are  explained  in  the  following  sections.  Since  some  built-in  method  calls  compile  into  assem¬ 
bly  language  instructions,  some  restrictions  are  necessary  on  the  use  of  their  selectors. 
Specifically,  if  any  other  methods  are  defined  using  the  selectors  in  Table  A-3,  they  must 
obey  the  identities  listed  in  Table  A-4. 

Redefining  Restricted  Selectors 

If  a  rertricted  selector  is  called  with  an  argument  that  is  not  one  of  the  built-in  classes  it  rec¬ 
ognizes,  the  actual  method  for  the  class  is  found  and  executed,  possibly  after  some  of  the  iden¬ 
tities  in  Table  A-4  have  been  applied.  Thus,  it  is  possible  to  define  a  class  of  type,  say,  com¬ 
plex,  and  define  a  method  *  for  numbers  of  that  type.  That  method  will  be  called  whenever 
*  is  used  on  a  number  of  type  complex,  regardless  of  whether  that  number  is  the  first  or  sec¬ 
ond  argument.  If  both  complex  numbers  and  quaternions  are  defined,  the  complex  *  method 
should  be  prepared  to  handle  a  quaternion  as  the  second  argument,  while  the  quaternion  * 
method  should  be  prepared  to  handle  a  complex  number  as  the  second  argument.  The  re¬ 
verse  methods  have  been  added  to  handle  the  case  of  a  non-built-in  object  being  the  second 
argument  of  a  noncommutative  operation.  The  <>,<=,  and  >=  methods  should  never  be  rede¬ 
fined,  as  they  are  never  called.  Redefine  =,  >,  and  <  instead. 


^Floating  point  numbers  may  not  be  implemented  in  all  Concurrent  Smalltalk  implementations. 
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The  associative  restricted  selectors  allow  an  arbitrary  number  of  arguments;  they  compile 
into  pairwise  invocations  of  the  corresponding  methods.  The  grouping  order  is  not  specified. 

Methods  declared  with  restricted  selectors  should  not  have  side  effects. 

The  identities  in  Table  A-3  have  been  carefully  selected  to  allow  efficient  implementation  of 
primitive  operations  without  sacrificing  functionality.  Some  identities  have  been  omitted  on 
purpose.  For  example,  *  does  not  have  to  be  commutative  in  general,  nor  does  (*  a  0)  have 
to  equal  0.  Not  requiring  these  identities  allows  *  to  be  used  to  multiply  quaternions  and 
matrices. 

The  restricted  selectors  not,  and,  or,  and  xor  may  not  be  distinguishable  from  lognot,  lo- 
gand,  logor,  and  logxor  on  all  implementations.  Redefining  these  should  be  avoided;  if 
they  must  be  redefined,  only  one  set  should  be  redefined. 

Table  A-3.  Restricted  Selectors 

not  and  or  xor  lognot  logand  logor  logxor 
<<=>>==<> 

neg  +  -  reverse —  *  II  reverse-//  mod  reverse-mod 
ash  reverse-ash  integer-length 


154 


Appendix  A 


Concurrent  Smalltalk  Reference 


Table  A-4.  Identities  among  Primitive  Methods 

•  +  is  associative  and  commutative. 

•  0  is  an  identity  for  +. 

•  (-  a  b)  =  (reverse —  b  a). 

•  (-  a  b)  *  (+  a  (neg  b) ) . 

•  (neg  (neg  a) )  =  a. 

•  *  is  commutative  with  scalar  constants  and  associative. 

•  1  is  an  identity  for  *. 

•  (*  a  -1)  -  (neg  a) . 

•  (*  a  2®)  =  (ash  a  e). 

•  (//  a  (neg  b) )  =  (neg  (//  a  b) ) . 

•  (//  a  2®)  *  (ash  a  -e). 

•  (//  a  b)  =  (reverse-//  b  a). 

•  (mod  a  (neg  b) )  =  (neg  (mod  a  b) ) . 

•  (mod  a  b)  =  (reverse-mod  b  a). 

•  (ash  a  b)  =  (reverse-ash  b  a). 

•  (ash  0  a)  =0. 

•  (ash  a  0)  =  a. 

•  (not  (not  a))  =  a. 

•  and.  or.  and  xor  are  associative  and  commutative. 

•  (and  a  false)  =  false. 

•  (and  a  true)  =  a. 

•  (or  a  false)  =  a. 

•  (or  a  true)  =  true. 

•  (xor  a  false)  =  a. 

•  (xor  a  true)  =  (not  a). 

•  (lognot  (lognot  a) )  =  a. 

•  logand.  logor.  and  logxor  are  associative  and  commutative. 

•  (logand  a  0)  =  0. 

•  (logand  a  -1)  =  a. 

•  (logor  a  0)  =  a. 

•  (logor  a  -1)  =  -1. 

•  (logxor  a  0)  =  a. 

•  (logxor  a  -1)  =  (lognot  a). 

•  {<  a  b)  =  (not  (>=  a  b) ) . 

•  (>  a  b)  =  (not  (<=  a  b) ) . 

•  (=  a  b)  =  (not  (<>  a  b)). 

•  (<  a  b)  =  (>  b  a) . 

•  (<=  a  b)  =  (>=  b  a). 

•  (=  a  b)  =  (=  b  a) . 

•  {<>  a  b)  =  (<>  b  a). 
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A.8.  System  and  Object  Operations 

Objects 

(new  c:standard-class)  ;object  Method 

New,  when  applied  to  a  standard  class,  creates  and  returns  a  new  instance  object  of  the  speci¬ 
fied  class.  The  object  is  not  initialized.  Some  implementations  may  restrict  the  new  argu¬ 
ment  to  be  a  constant  expression. 

Copiers 

(deep-copy  orobject)  :object  Method 

Deep-copy  returns  a  copy  of  the  object.  Any  of  the  object’s  instance  variables  are  also  recur¬ 
sively  copied  using  deep-copy.  If  the  class  of  the  object  is  immutable,  deep-copy  may  just 
return  the  object  it  received.  Deep-copy  may  fail  to  terminate  on  circular  object  references. 

(shallow-copy  o.'Object)  .object  Method 

Shallow-copy  returns  a  copy  of  the  object  without  copying  any  of  the  object’s  instance  vari¬ 
ables.  If  the  class  of  the  object  is  immutable,  shallow-copy  may  just  return  the  object  it  re¬ 
ceived. 

(copy  o.'Object)  ;object  Method 

Copy  is  the  most  appropriate  copying  routine  for  a  given  object.  It  defaults  to  shallow- 
copy. 

Deallocators 

In  addition  to  waiting  for  garbage  collection,  the  following  methods  can  be  used  to  explicitly 
deallocate  the  storage  for  an  object.  Accessing  an  object  after  it  has  been  deallocated  causes 
an  error. 

(deep-dispose  o:object)  :null  Method 

Deep-dispose  deallocates  the  object’s  storage.  Any  of  the  object’s  instance  variables  are 
also  recursively  disposed  using  deep-dispose.  Deep-dispose  should  not  be  used  on  circu¬ 
lar  or  multiple  object  references. 

(shallow-dispose  o.'Object)  .null  Method 

Shallow-dispose  deallocates  the  object’s  storage  without  disposing  any  of  the  object’s  in¬ 
stance  variables. 

(dispose  oiobject)  ;null  Method 

Dispose  is  the  most  appropriate  deallocating  routine  for  a  given  object.  It  defaults  to  shal¬ 
low-dispose. 

Class  Inquiries 


(class-of  o:object)  :class  Method 

Class-of ,  when  applied  to  an  object,  returns  its  class. 
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(class-kind?  oiobject  cxlass)  :boolean  Method 

(class-member?  oiobject  c:class)  :boolean  Method 

Class-kind?  returns  a  boolean  value  that  specifies  whether  the  given  object  is  an  instance 
of  the  given  class  or  one  of  its  subclasses.  Class-member?  is  just  like  Class-kind?  except 
that  it  returns  true  only  if  the  object  is  a  direct  instance  of  the  given  class. 

(subclass?  cl,  c2;class)  :boolean  Method 

Subclass?  returns  true  if  cl  is  a  subclass  of  c2  and  false  otherwise. 
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A.9.  Distributed  Objects 

distobj  Class 

Distob  j  is  the  distributed  object  class. 

Group  and  Constituents 

A  distributed  object  consists  of  a  group  name  and  one  or  more  constituent  objects.  The  con¬ 
stituent  objects  act  just  like  normal  objects  except  that  they  inherit  methods  and  instance 
variables  from  the  class  distobj  and  they  respond  to  the  group  and  get -group  messages. 
A  group  name  indicates  the  entire  collection  of  distributed  objects.  When  a  method  is  called 
on  the  group  name,  it  is  processed  by  one  of  the  distobj’s  constituent  objects,  as  though  the 
method  were  called  on  that  constituent  object.  The  identity  of  the  constituent  object  receiv¬ 
ing  the  message  is  left  unspecified;  implementations  are  encouraged  to  heuristically  pick  dif¬ 
ferent  constituent  objects  for  different  calls  to  the  group,  thereby  facilitating  concurrency  for 
distributed  object  operations.  When  a  constituent  object  is  processing  a  method,  self  is  the 
constituent  object,  not  the  group  name. 

Creation 

(new  odistributed-class  n-constituents  integer)  ;distobj  Method 

New,  when  applied  to  a  distributed  class,  creates  and  returns  a  new  distributed  object  of  the 
specified  class  with  the  given  logical  number  of  constituents.  The  constituents  are  not  initial¬ 
ized. 

The  distributed  object  that  is  created  may  contain  more  constituents  than  n-constituents.  The 
runtime  system  determines  an  appropriate  physical  number  of  constituents  for  the  dis¬ 
tributed  object  that  is  at  least  as  large  as  n-constituents.  The  additional  constituents  should 
be  prepared  to  respond  to  messages  sent  to  the  distributed  object. 

Operations 

(CO  o'.distobj  ninteger)  idistobj  Method 

Co  returns  the  nth  constituent  object  of  the  distributed  object,  o  can  be  either  the  group  ob¬ 
ject  or  any  of  its  constituents.  N  must  be  between  0  and  the  physical  number  of  constituent 
objects  in  the  distobj  minus  one. 

(logical-limit  oidistobj)  integer  Method 

Logical-limit  is  the  logical  number  of  constituent  objects  in  the  distributed  object. 

(physical -limit  o .distobj)  .integer  Method 

Physical-limit  is  the  physical  number  of  constituent  objects  in  the  distributed  object.  The 
constituent  objects  are  numbered  between  0  and  physical-limit  minus  one,  inclusive. 
Physical-limit  is  never  less  than  logical-limit. 

(index  oidistobj)  integer  Method 

Index  is  the  number  of  a  particular  constituent  object  in  a  distributed  object.  Index  ranges 
between  0  and  physical -limit  minus  one,  inclusive. 
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9roup:clist0bj  instance  Variable 

(group  oidistobj)  idistobj  Method 

(get -group  o:distobi)  :distobj 

Group  is  the  inverse  of  co — it  returns  the  group  object  of  the  given  distributed  object.  O  can 
be  either  the  group  object  or  any  of  its  constituents;  if  o  is  already  a  group  object,  group  just 
returns  it.  Get -group  is  functionally  equivalent  to  group;  it  is  provided  to  avoid  name  con¬ 
flicts  with  the  group  variable  inside  distributed  object  methods. 
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Comparisons 


(eq  ol,  o2:object)  :b00lean  Function 

(neq  ol,  o2;object)  :boolean  Function 

Eq  returns  true  if  the  two  objects  are  indistinguishable — there  is  no  legal  way  of  distin¬ 
guishing  ol  from  o2.  For  mutable  objects  this  means  that  ol  and  o2  are  the  same  object. 
For  immutable  objects,  eq  may  in  addition  return  true  if  ol  and  o2  are  different  objects  that 
contain  the  same  data. 

Eq  may  return  unusual  results  for  inline  classes — an  instance  object  of  an  inline  class  is  not 
necessarily  eq  to  itself,  but  eq  will  never  return  true  on  distinguishable  objects. 

Neq  is  the  logical  negation  of  eq. 

<«  ol.  o2;object)  rboolean  Method 

(<>  ol.  o2;object)  :boolean  Method 

These  comparisons  return  true  if  ol  is  equal  to  or  not  equal  to  o2,  respectively.  Equality 
means  numeric  equality  for  numbers.  It  defaults  to  eq  or  neq  for  other  objects,  but  the  = 
method  can  be  overridden  to  specify  a  different  criterion  for  a  particular  class. 


(<  ml,  in2 magnitude)  .boolean 
(<=  ml,  m2;magnitude)  .boolean 
(>  ml,  m2  .magnitude)  .boolean 
(>«  ml,  m2:magnitude)  :boolean 


Abstract  Method 
Abstract  Method 
Abstract  Method 
Abstract  Method 


These  comparisons  return  true  if  ml  is  less  than  m2,  ml  is  less  than  or  equal  to  m2,  ml  is 
greater  than  m2,  or  ml  is  greater  than  or  equal  to  m2,  respectively.  For  the  purposes  of  com¬ 
parison,  false  is  considered  to  be  less  than  true.  It  is  an  error  to  use  <,  <=,  >,  or  >=  to  com¬ 
pare  an  object  from  one  direct  subclass  of  magnitude  with  one  of  another  direct  subclass  of 
magnitude — a  boolean  cannot  be  compared  with  an  integer. 


(max  ml,  m2  .magnitude)  .magnitude  Method 

(min  ml,  m2;magnitude)  :magnitude  Method 

Max  returns  the  greater  of  ml  and  m2,  while  min  returns  the  lesser  one.  Both  max  and  min 
use  one  of  the  comparison  operations  above  to  decide  which  is  the  greater  or  lesser,  and  the 
same  caveats  as  above  apply. 

Logical  Operations 


(not  b.'boolean)  .boolean  Method 

Not  returns  the  logical  negation  of  b. 

(and  (b:boolean)*)  :boolean  Method 

And  returns  the  logical  AND  of  its  arguments.  If  no  arguments  are  specified,  and  returns 
true. 

(or  (bboolean)*)  iboolean  Method 

Or  returns  the  logical  inclusive  OR  of  its  arguments.  If  no  arguments  are  specified,  or  re¬ 
turns  false. 
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(zor  (b.'boolean)*}  .'boolean  Method 

Xor  returns  the  logical  exclusive  OR  of  its  arguments.  If  no  arguments  are  specified,  xor 
returns  false. 

(sc-and  (b:boolean)*)  iboolean  Macro 

(sc-or  (b:boolean)‘)  iboolean  Macro 

These  are  short-circuit  versions  of  and  and  or.  They  evaluate  arguments  sequentially  from 
left  to  right  only  as  far  as  necessary  for  the  answer  to  be  unambiguously  determined. 


Arithmetic  Operations 

For  most  binary  arithmetic  operations,  the  class  of  the  result  is  the  class  of  the  most  general 
argument.  For  example,  if  two  integers  are  added,  the  result  is  an  integer,  but  if  an  in¬ 
teger  and  a  float  are  added,  the  result  is  a  float.  User-defined  classes  may  define  other 
numeric  subclasses,  in  which  case  they  have  to  handle  appropriate  coercions  themselves — ^if  a 
number  is  added  to  a  member  of  a  user-defined  subclass  of  number,  the  +  method  for  the 
user-defined  subclass  will  have  to  dispatch  on  the  type  of  its  second  argument. 

(zero?  mnumber)  iboolean  Method 

Zero?  returns  true  if  n  is  zero  and  false  otherwise. 

(neg  mnumber)  inumber  Abstract  Method 

Neg  returns  the  negation  of  n.  The  class  of  the  result  value  is  the  same  as  the  class  of  n. 

(+  (mnumber)*)  inumber  Abstract  Method 

+  returns  the  sum  of  its  arguments.  If  no  arguments  are  specified,  +  returns  0. 

(-  nl,  n2 inumber)  inumber  Abstract  Method 

-  returns  the  difference  of  its  arguments,  nl-n2. 

(*  (mnumber)*)  inumber  Abstract  Method 

*  returns  the  product  of  its  arguments.  If  no  arguments  are  specified,  *  returns  1. 

(/  nl.  n2inumber)  inumber  Abstract  Method 

/  returns  the  quotient  of  its  arguments,  nl/n2.  Ifni  and  n2  are  both  integers  and  nl  is 
not  exactly  divisible  by  n2,  the  result  is  a  float.  If  n2  is  zero,  either  an  error  occurs  or  some 
representation  of  infinity  is  substituted  as  an  answer. 

(//  nl,  n2iinteger)  .'integer  Method 

//  returns  the  integer  quotient  of  its  arguments  rounded  towards  -o°,  Lnl/n2j.  If  n2  is  zero, 
either  an  error  occurs  or  some  representation  of  infinity  is  substituted  as  an  answer.  Having 
/  /  round  towards  -<>°  allows  the  use  of  ash  to  divide  when  the  divisor  is  an  integral  power  of 
two. 


(mod  nl,  n2iinteger)  linteger  Method 

Mod  returns  the  nonnegative  remainder  of  dividing  nl  by  n2,  nl-n2*Lnl/n2j.  If  n2  is  zero,  ei¬ 
ther  an  error  occurs  or  some  representation  of  an  indeterminate  number  is  substituted  as  an 
answer.  Having  mod  return  the  nonnegative  remainder  allows  the  use  of  logand  to  find  the 
remainder  when  the  divisor  is  an  integral  power  of  two.  When  the  remainder  is  nonzero,  its 
sign  is  always  the  same  as  the  sign  of  the  divisor  n2.  Also,  (+  (mod  nl  n2)  (*  n2  (// 

nl  n2 )  )  )  =  nl. 
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(ash  ni:integer  n2;in!eger)  ;integer  :^thod 

(ash  ni  .'float  n2;jnteger)  .-float  Hfethod 

Ash  returns  nl  multiplied  by  two  raised  to  the  n2th  power,  nl*2"2.  Ifni  is  a  float,  no 
rounding  takes  place;  however,  if  nl  is  an  integer  and  n2  is  negative,  the  result  is  rounded 
towards 

(integer-length  n:integer)  ;integer  Method 

Integer-length  returns  the  bit  “size”  of  nl.  For  positive  n  this  is  riog2(n+l)l,  while  for 
negative  n  it  is  equal  to  riog2(-n)l 

Bitwise  Logical  Operations 


(lognot  biboolean)  ;boolean  Method 

(logand  (b:booleari)+)  :boolean  Method 

(logor  (b:boolean)-i )  iboolean  Method 

(logxor  (b:booiean)-(-}  :boolean  Method 

(lognot  b;integer)  linteger  Method 

(logand  (binteger)*)  linteger  Method 

(logor  (biinteger)*)  integer  Method 

(logxor  (binteger)*)  integer  Method 


These  methods  perform  bitwise  logical  operations.  When  called  on  booleans,  they  perform 
the  same  operations  as  not,  and,  or,  and  xor,  respectively.  When  called  on  integers,  they 
perform  the  corresponding  operations  bitwise  on  semi-infinite  two  s  complement  representa 
tions  of  the  integers,  treating  0  as  false  and  1  as  true.  The  integers  do  not  have  to  be  in¬ 
ternally  stored  in  the  two’s  complement  form;  all  that  is  necessary  is  that  these  operations 
act  as  if  they  were.  When  supplied  with  no  arguments,  logand  returns  -1,  while  logor  and 
logxor  return  0. 
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A.ll.  Locks 

Locks  are  used  to  synchronize  processes.  A  lock  can  be  acquired  by  only  one  process  at  a 
time,  and  the  acquiring  operation  is  atomic.  After  a  process  has  acquired  a  lock,  it  can  pro¬ 
ceed  to  perform  whatever  exclusive  operations  it  wants  to  do.  When  it  is  done,  it  should  re¬ 
lease  the  lock  to  make  it  available  again.  If  a  process  attempts  to  acquire  a  lock  that  is  busy 
(acquired),  it  will  wait  until  the  lock  is  available. 

Two  built-in  lock  classes  are  provided:  simple-lock  and  queueing-lock.  Simple-lock 
is  a  very  cheap  and  fast  implementation  intended  for  situations  in  which  a  lock  is  not  ac¬ 
quired  for  long  periods  of  time  and  there  is  little  contention  for  the  lock.  Simple-locks  are 
adequate  for  most  purposes.  Qaeueing-locks  are  heavy-duty  locks  for  use  in  situations 
where  ther*-  may  be  significant  contention  for  a  lock. 

Lock  Operations 


(new-simple -lock)  ;simple-IOCk  Function 

(new-queueing-lock)  :queueing-lock  Function 

New-simple-lock  creates  a  new  simple  lock,  while  new-queueing-lock  creates  a  new 
queueing  lock.  The  lock  is  initially  available. 

(init  l:simple-lock)  ;null 
(init  l;queueing-lock)  :null 

Init  reinitializes  the  lock,  making  it  available  regardless  of  its  previous  state. 

(acquire  l.iock)  .null  Abstract  Method 

Acquire  acquires  the  lock.  If  the  lock  is  busy,  acquire  waits  until  the  lock  is  available  be¬ 
fore  acquiring  it  and  returning. 

(release  lilock)  :null  Abstract  Method 

Release  releases  the  lock.  If  the  lock  is  already  available,  release  signals  an  error. 

(busy?  l.iock)  ;b00lean  Abstract  Method 

Busy?  returns  true  if  the  lock  is  busy  and  false  otherwise. 

(with-locks  ((l;lock)*)  body)  Macro 

Wit  h-locks  first  acquires  all  of  the  locks  listed,  in  the  order  in  which  they  are  listed,  then 
evaluates  body,  and  finally  releases  all  of  the  locks.  It  returns  the  value  of  body. 


Method 

Method 
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A.12.  Strings  and  Arrays 

Strings  and  arrays  are  the  primitive  data  structures  for  keeping  track  of  indexed  collections 
of  data.  All  primitive  strings  and  arrays  are  subclasses  of  the  class  array.  The  subclasses  of 
class  array  can  be  implemented  as  arrays,  but  implementations  are  encouraged  to  pack  in¬ 
teger-arrays,  strings,  and  boolean-arrays  to  conserve  space  and  time. 

Creating  Arrays 


(new-sinple-array  size:integer}  '.simple-array  Method 

New-simple-array  creates  a  new  simple  array  of  arbitrary  objects.  Size  specifies  the 
number  of  elements  in  the  array;  the  elements  are  numbered  0  through  size-1.  The  array’s 
elements  are  not  initialized. 

(new-integer-array  sizeiinteger  low.  highiinleger)  ;integer-array  Method 

New-integer-array  creates  a  new  array  of  integers  in  the  range  between  low  and  high, 
inclusive.  Low  must  be  less  than  or  equal  to  high.  Size  specifies  the  number  of  elements  in 
the  array;  the  elements  are  numbered  0  through  size-1.  The  array’s  elements  are  not  ini¬ 
tialized. 

(new-string  sizeiinteger)  :slring  Method 

New-string  creates  a  new  array  of  characters,  also  called  a  string,  size  specifies  the  num¬ 
ber  of  elements  in  the  array;  the  elements  are  numbered  0  through  size-1.  The  array’s  ele¬ 
ments  are  not  initialized. 

(new-boolean-artay  size.integer)  ;boolean-array  Method 

New-boolean-array  creates  a  new  array  of  booleans.  Size  specifies  the  number  of  ele¬ 
ments  in  the  array;  the  elements  are  numbered  0  through  size-1.  The  array’s  elements  are 
not  initialized. 

Operations  on  Entire  Arrays 

(fill  aiarray  value)  :array  Abstract  Method 

Fill  destructively  writes  value  to  every  element  of  the  given  array.  If  the  array  is  an  in¬ 
teger-array,  a  string,  or  a  boolean-array,  the  value  must  have  the  correct  type  and,  in 
the  case  of  integer-array,  it  must  be  in  the  range  specified  when  the  array  was  created; 
otherwise,  the  results  are  unspecified.  Fill  returns  the  updated  array. 

(init  aiarray  f  :funct)  :array  Abstract  Method 

Init  concurrently  calls  f  c.i  integers  between  0  and  the  size  of  a  minus  one,  inclusive,  and 
stores  the  results  in  the  corresponding  elements  of  a.  If  f  or  any  other  function  tries  to  read 
an  element  of  a,  it  will  wait  until  the  value  is  available.  It  is  an  error  for  f  or  any  other  func¬ 
tion  to  try  to  alter  the  values  of  elements  of  a  before  init  returns.  Init  returns  the  a  array 
after  all  calls  to  f  have  returned, 

(map  src:array  dst:array  f:funct)  :array  Abstract  Method 

Map  concurrently  calls  f  on  each  element  of  the  src  array  and  stores  the  results  in  the  corre¬ 
sponding  elements  of  the  dst  array.  The  sizes  of  the  two  arrays  must  be  equal.  If  src  is  a 
simple-array,  so  must  be  dst,  Src  and  dst  may  be  the  same  array.  If  f  or  any  other 
function  tries  to  read  an  element  of  the  dst  array,  it  will  wait  until  the  value  is  available.  It 
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is  an  error  for  f  or  any  other  function  to  try  to  alter  the  values  of  elements  of  the  dst  array 
before  map  returns.  Map  returns  the  dst  array  after  all  calls  to  f  have  returned. 

(for-each  aiarray  £:funct)  :array  Abstract  Method 

(n£or-each  a;array  £:funct}  ;array  Abstract  Method 

Both  of  the  above  methods  concurrently  call  f  on  each  element  of  the  array  and  then  return 
the  array  without  modifying  it.  Nf  or-each  does  not  wait  until  any  of  the  calls  to  f  return, 
while  for-each  does. 

Accessing  Arrays 


[a;array  pos:integer]  :object  Abstract  Method 

(get  a:array  pos:inleger)  lobject 

Get  returns  the  element  at  position  pos  of  the  given  array.  Get  signals  an  error  if  pos  is 
outside  the  bounds  of  the  array.  The  results  of  accessing  an  uninitialized  element  are  un¬ 
specified. 

(set  [aiarray  pos:integer]  value :object )  larray  Abstract  Method 

(put  a:array  posiinteger  value :object )  :array 

Put  destructively  writes  value  at  position  pos  of  the  given  array.  Value  is  not  touched. 
Put  signals  an  error  if  pos  is  outside  the  bounds  of  the  array.  If  the  array  is  an  integer- 
array,  a  string,  or  a  boolean-array,  the  value  must  have  the  correct  type  and,  in  the 
case  of  integer-array,  it  must  be  in  the  range  specified  when  the  array  was  created;  oth¬ 
erwise,  the  results  are  unspecified.  Put  returns  the  updated  a'ray. 

(size  a:array)  ;integer  Abstract  Method 

Size  returns  the  size  of  the  array,  as  specified  when  the  array  was  created. 
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A.13.  Input  and  Output 

Streams 

Streams  are  sources  and  sinks  of  data.  A  stream  is  usually  a  connection  to  a  terminal  or  to  a 
file,  but  other  uses  of  streams  are  possible.  Concurrent  Smalltalk  defines  a  general  class 
stream  as  well  as  a  specific  implementation  of  streams,  system-stream.  Other  user-de¬ 
fined  stream  classes  may  be  defined  as  subclasses  of  stream. 

Operations  on  General  Streams 

Reading 

(read-stream-char  s:stream)  lobject  Abstract  Method 

Read-stream-char  reads  a  character  from  stream  s  and  returns  it.  If  there  is  no  more  in¬ 
put  available  on  the  stream,  read-stream-char  returns  nil. 

(read-stream-line  s .Stream)  .object  Abstract  Method 

Read-stream-line  atomically  reads  a  line  from  stream  s  and  returns  it  in  the  form  of  a 
string  (without  the  trailing  line  terminator).  If  there  is  no  more  input  available  on  the 
stream,  read-stream-line  returns  nil. 

(read-stream  s;stream)  lobject  Abstract  Method 

Read-stream  reads  some  representation  of  a  Concurrent  Smalltalk  object  from  stream  s 
and  returns  it.  If  there  is  no  more  input  available  on  the  stream,  read-stream  returns  the 
constant  end-of-f  ile. 

end-of-£ile;object  Constant 

This  unique  constant  is  returned  when  read-stream-object  encounters  an  end  of  file. 

(stream-char-ready?  siStream)  ;b00lean  Abstract  Method 

Stream-char-ready?  returns  true  if  a  character  is  ready  to  be  read  from  stream  s.  It  is 
not  guaranteed  to  return  false  otherwise,  so  an  implementation  that  always  returns  true 
is  acceptable. 

Writing 

(write-stream-char  s:stream  ch:character)  .null  Abstract  Method 

Write-stream-char  writes  character  ch  onto  stream  s. 

(write-stream-string  s.Stream  string:string)  ;null  Method 

Write-stream-string  writes  string  string  onto  stream  s.  Write-stream-string  is 
equivalent  to  calling  write-stream-char  on  each  character  in  string  except  that  string 
is  written  atomically. 

(write-stream  s;stream  (oiobject)*)  :null  Method 

Write-stream  writes  some  representation  of  the  given  Concurrent  Smalltalk  objects  onto 
stream  s.  It  uses  print  to  format  objects  it  does  not  know  about.  Care  should  be  taken 
when  writing  circular  structures  to  make  sure  that  write-st  ream,  terminates. 
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(display-stream  s:stream  (oiobjecl)*)  :null  Method 

Display-stream  writes  some  representation  of  the  given  Concurrent  Smalltalk  objects  onto 
stream  s.  Strings  and  characters  are  written  literally,  without  escape  characters.  Care 
should  be  taken  when  writing  circular  structures  to  make  sure  that  display-stream  termi¬ 
nates. 

Atomicity 

(split  sistream)  istream  Abstract  Method 

Split  returns  a  new  stream  that  can  be  used  for  atomic  writing  to  s.  Anything  written  to 
the  returned  stream  is  atomically  written  onto  s  when  join  is  called  on  the  returned  stream. 

(join  siStream]  ;nulf  Abstract  Method 

Join  joins  s  back  to  a  stream  from  which  it  was  split.  It  is  an  error  to  call  join  on  a  stream 
not  returned  by  split  or  to  call  it  more  than  once  on  such  a  stream. 

Input  and  Output  Streams 


terminal  -  st  rearnisystem-stream  Global 

Terminal-stream  is  the  system-stream  used  for  interaction  with  the  terminal. 


(read-char)  :object 
(read-line)  lObject 
(read)  :object 
(char-ready?)  :boolean 


Function 

Function 

Function 

Function 


(write-char  chxharacter)  ;null 
(write-string  string'.Slring)  :null 
(write  (oiobject)*)  :null 
(display  (olobject)*)  :null 


Function 

Function 

Function 

Function 


(split -terminal)  istream  Function 

These  functions  are  the  terminal  equivalents  of  the  general  stream  methods  above. 


Formatting 


(print  o:object  s:stream)  ;null  Abstract  Method 

Print  is  used  for  formatting  arbitrary  objects  for  the  purposes  of  write-stream.  Print 
should  output  some  readable  representation  of  object  o  onto  stream  s. 

(display-print  o:object  s;stream)  ;null  Method 

Display-print  is  used  for  formatting  arbitrary  objects  for  the  purposes  of  display- 
stream.  Display-print  should  output  some  readable  representation  of  object  o  onto 
stream  s,  avoiding  escape  characters  where  possible. 
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A.14.  Macros 

Concurrent  Smalltalk  provides  a  macro  facility  which  can  be  used  to  extend  the  language.  A 
macro  consists  of  a  pattern,  an  optional  guard,  and  a  replacement.  The  pattern  can  contain 
variables  or  literals  (a  literal  is  an  identifier).  If  it  matches  with  an  expression  and  the  guard 
is  satisfied,  that  expression  is  replaced  by  the  replacement,  which  can  be  either  another  pat¬ 
tern  or  a  Common  Lisp  fiuiction. 

(defmacro  pattern  [guard]  replacement)  Top-i«val  Macro 

pattern  :;=  literal  |  ?name  |  'name  |  (pattern*  [pattern  .  pattern])  ]  gpattem 
replacement  ;>  r-pattem  |  lisp-statement 

r-pattem  ;:=  literal  |  ?name  |  iname  |  (r-pattem*  [r-pattem  .  r-pattem]) 
guard  ;>  aguard  lisp-statement 
lisp-statement  #l  lisp 

The  macro  pattern  is  a  nested  list  of  literals  and  macro  variables.  Variables  are  preceded  by 
question  marks  (?)  or  exclamation  points  ( ! ).  Question-mark  variables  can  match  identifiers, 
numbers,  and  lists,  while  exclamation-point  variables  can  only  match  identifiers.  The  dotted 
notation  at  the  end  of  a  list  indicates  that  the  rest  of  the  list  should  match  the  pattern  after 
the  dot.  When  a  pattern  is  matched  to  a  candidate  statement,  all  instances  of  the  same  vari¬ 
able  have  to  match  identical  forms.  The  pattern  can  be  as  simple  as  ?x,  which  will  match 
any  statement. 

If  an  @  symbol  precedes  a  pattern,  the  form  to  which  the  pattern  would  match  is  macro-ex¬ 
panded  before  it  is  matched  to  the  pattern.  To  avoid  infinite  loops,  @  should  not  be  the  first 
symbol  in  a  macro  pattern. 

The  guard,  if  present,  is  a  Common  Lisp  statement  that  returns  a  boolean  value.  If  the  value 
returned  is  true,  the  macro  replacement  is  substituted  for  the  pattern;  if  not,  the  macro  is 
not  expanded.  The  values  of  the  ?  and  !  variables  are  bound  in  a  Common  Lisp  scope  just 
outside  the  statement,  so  the  Common  Lisp  statements  can  refer  to  the  matched  values  of  the 
variables  just  by  referring  to  the  correct  variable  names  (including  the  leading  ?  or  ! ). 

Replacement  can  be  either  another  pattern  or  another  block  of  Common  Lisp  statements.  If 
replacement  is  a  pattern,  the  values  of  the  macro  variables  are  substituted  in  it,  and  the  re¬ 
placement  pattern  replaces  the  original  pattern  in  the  code.  If  replacement  is  a  Common  Lisp 
statement,  it  is  expected  to  return  a  list  which  replaces  the  original  pattern  in  the  code.  As 
in  the  case  of  a  guard,  the  Common  Lisp  statement  has  access  to  the  matched  values  of  the 
macro  variables. 

The  macro  replacement  pattern  can  be  another  macro.  Macros  are  expanded  until  the  result¬ 
ing  form  does  not  satisfy  any  of  the  existing  macro  patterns  and  guards.  When  several 
macros  match  a  form,  the  form  is  expanded  using  the  macro  that  was  most  recently  defined. 
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A.15.  Environment 

Errors 

(error  (msgiobject)*)  Function 

Error  signals  a  run-time  error.  The  arguments,  if  any,  should  contain  descriptive  informa¬ 
tion  about  the  error.  The  interpretation  of  the  arguments’  values  is  implementation-depen- 
dent. 

(halt)  Function 

Halt  halts  execution  of  the  current  program  due  to  a  run-time  error.  Debugging  information 
about  the  function  or  method  in  which  the  halt  took  place  may  be  printed. 

Utilities 

(include  "file-name")  Top-level  Primitive 

Include  reads  the  definitions  in  the  file  named  file-name,  as  if  that  file  were  included  in 
place  of  the  include  primitive. 

Options 

(pragma  ...)  Top-level  Primitive 

Pragma  is  a  general  compiler  declaration  and  can  contain  any  implementation-dependent  in¬ 
formation. 

(declare  option  value)  Top-level  Primitive 

Declare  sets  the  compiler  option  named  option  to  the  value  specified.  Value  must  be  a  legal 
value  for  the  option;  most  compiler  options  are  booleans,  and  for  these  value  must  evaluate  to 
either  true  or  false.  Value  must  be  a  constant  expression. 

(option  option)  Primitive 

Option  returns  the  compile-time  value  of  the  specified  compiler  option. 


169 


Appendix  B.  Using  Optimist  II 

This  appendix  describes  the  procedure  for  using  the  Optimist  II  compiler  on  a  Macintosh  II  to 
compile  Concurrent  Smalltalk  programs.  In  addition,  a  few  helpful  non-standard  Concurrent 
Smalltalk  features  implemented  by  Optimist  II  are  described. 

Starting  the  Compiler 

To  start  the  compiler,  load  the  image  containing  the  compiler  and  the  Common  Lisp  envi¬ 
ronment.  If  such  an  image  is  not  available,  load  Common  Lisp,  PCL,  the  Loop  macro,  and 
the  Optimist.Usp  file.  Execute  the  (optimist :  compile-optimist)  command  to  compile 
and  load  the  compiler,  or,  if  it  was  already  compiled,  use  (optimist :  load-optimist)  to 
load  the  compiler. 

The  compiler  provides  only  one  useful  external  Lisp  function.  It  is  (interactive-cst ) . 
Typing  (interactive-cst)  will  enter  an  interactive  Concurrent  Smalltalk  listener  loop. 

Top-Level  Commands 

Utility  Commands 

(begin  body)  Top-Level  Primitive 

Due  to  constraints  in  the  compiler,  a  select  few  forms  such  as  include  and  def  class  (but 
not  all  of  the  top-level  primitives;  most  of  the  primitives  listed  as  top  level  really  only  re¬ 
quire  that  they  not  be  included  in  any  function  or  method)  must  be  present  at  the  top  level. 
However,  sometimes  it  is  desirable  to  emit  sequences  of  those  directives  as  results  of  macros; 
to  allow  this,  a  special  form  of  begin  was  provided.  If  begin  appears  at  the  top  level,  every 
form  inside  it  is  also  evaluated  at  the  top  level. 

(set  name  expression)  Top-Level  Macro 

Set  normally  sets  the  variable  name  to  the  value  of  expression.  However,  if  it  is  placed  at 
the  top  level,  it  is  also  allowed  to  create  a  new  global  variable  name  if  one  does  not  exist  al¬ 
ready.  Thus,  at  the  top  level,  set  acts  as  either  set  or  def  global,  depending  on  whether 
the  global  variable  name  already  exists. 

(include)  Top-Level  Primitive 

Include,  when  passed  no  file  argument,  will  let  the  user  interactively  choose  a  text  file  and 
then  include  it.  This  feature  is  only  available  on  the  Macintosh  version  of  Optimist  II. 

Viewing  Objects 

While  the  listener  loop  is  active,  any  Concurrent  Smalltalk  command  will  be  immediately 
evaluated,  and  the  results  displayed  in  the  listener  window.  The  resulting  object  may  be 
displayed  in  a  somewhat  strange  syntax;  for  example,  integers  may  be  displayed  as 
#<Integer  5>,  and  booleans  as  #<True>  or  #<False>.  The  following  commands  may  be 
used  to  show  the  internal  structure  of  objects: 

(show  o:object)  :objecl  Top-Level  Primitive 

Show  shows  as  a  side  effect  the  Optimist  II  internal  representation  of  an  object.  If  the  object 
is  a  function,  its  hcodes  are  shown;  if  the  object  is  a  complex  object,  some  of  its  structure  may 
be  shown.  The  output  is  controlled  by  the  CLOS  show  generic  function.  The  value  of  the 
show  directive  is  the  object  itself,  so  the  object  is  usually  printed  normally  after  it  is  shown. 
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Please  note  that  the  hcodes  shown  for  a  function  are  only  an  approximation  of  the  actual 
hcode  data  structure  used  internally  to  represent  the  function.  Some  of  the  more  esoteric 
fields  are  not  shown,  and  sometimes  a  function  may  have  two  variables  with  the  same  name, 
which  leads  to  confusing  output.  Variable  names  were  included  for  human  readability  only; 
Optimist  II  does  not  use  them  internally.  It  is  able  to  keep  the  variables  distinct  regardless 
of  their  names.  Also,  since  a  digraph  is  a  nonlinear  structure,  pseudo-hcodes  such  as  jump 
labels  in  conditionals  and  jump,  label,  and  break  hcodes  are  inserted  into  the  output  to  make 
it  readable. 

(describe  olObject)  :object  Top-Level  Primitive 

Describe  describes  as  a  side  effect  the  Optimist  II  internal  representation  of  an  object.  It  is 
just  like  show  except  that  the  information  displayed  is  longer  and  more  detailed. 

(show-hcode  f  function  (#Llisp-function-name])  function  Top-Level  Primitive 

Show-hcode  calls  the  Optimizer’s  non-MDP-specific  optimizations  to  optimize  the  function 
and  shows  the  resulting  hcode.  Show-hcode  may  invoke  global  optimizations  and  try  to  in¬ 
line  the  functions  called  by  f,  so  this  directive  may  take  some  time  to  execute.  When  the 
progress  option  is  true  (the  default),  progress  information  is  displayed  in  the  listener  win¬ 
dow  while  this  directive  is  executing.  Detailed  progress  can  be  obtained  by  setting  the  de¬ 
tailed-progress  option.  Show-hcode  performs  no  side  effects  on  the  Concurrent 
Smalltalk  environment,  and  it  does  not  do  a  treewalk  of  the  Concurrent  Smalltalk  program. 
Show-hcode  returns  f  as  its  result. 

If  lisp-function-name  is  provided,  instead  of  showing  the  optimized  hcode,  show-hcode  calls 
the  Lisp  function  lisp-function-name  with  the  optimized  hcode  as  an  argument.  Describe-din- 
Odes  is  a  useful  Lisp  function  that  will  describe  the  compiled  hcode  in  a  little  more  detail. 

Show-hcode  will  not  optimize  a  selector.  If  viewing  optimized  method  code  is  desired,  the 
method  must  be  extracted  explicitly  using  the  Concurrent  Smalltalk  method  primitive. 

(show-mdp-hcode  f  function  (#Llisp-function-name])  .function  Top-Level  Primitive 

Show-mdp-hcode  is  just  like  show-hcode  except  that  it  also  performs  the  MDP-specific 
hcode  optimizations. 

(show-asm  f  function  [#Llisp-function-name))  function  Top-Level  Primitive 

Show-asm  compiles  the  function  f  all  the  way  to  assembly  code  and  prints  the  resulting 
MDPSim-compatible  text.  If  lisp-function-name  is  supplied,  it  is  assumed  to  be  a  lisp  function 
and  called  with  the  assembly  language  module  as  its  only  argument. 

Compiling  Programs 

(conpile  f  :object  ("OUtput-file-name"))  .object  Top-Level  Primitive 

Compile  compiles  and  treewalks  the  Concurrent  Smalltalk  data  structures  starting  with  f  as 
a  root.  Normally  f  is  a  function,  in  which  case  it  is  compiled  to  assembly  language  along  with 
any  other  functions  that  it  might  need.  If  output-file-name  is  specified,  the  MDPSim  file  is 
written  to  a  new  file  named  output-file-name;  otherwise,  the  output  is  sent  to  the  listener. 
When  the  progress  option  is  true  (the  default),  progress  information  is  displayed  in  the  lis¬ 
tener  window  while  this  directive  is  executing.  Detailed  progress  can  be  obtained  by  setting 
the  detailed-progress  option. 

Options 

As  described  in  Section  A. 15,  Concurrent  Smalltalk  options  can  be  set  using  the  declare 
Concurrent  Smalltalk  primitive  and  examined  using  the  option  primitive.  The  options  cur¬ 
rently  provided  by  Optimist  II  are  listed  in  Table  B-1. 
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Option 

Default 

Action 

n-nodes 

4 

Define  the  number  of  nodes  of  a  simulated  J-Machine.  This 
option  only  affects  Optimist’s  internal  interpreter;  the  com¬ 
piled  code  is  generic  and  will  work  on  a  J-Machine  of  any 
size  (as  long  as  the  dimensions  are  powers  of  two). 

precise 

false 

Inhibit  optimizations  that  would  affect  the  semantics  of  fu¬ 
tures  and  lazy-futures  in  a  few  esoteric  cases.  If  following 
precise  Concurrent  Smalltalk  semantics  is  not  important, 
disabling  this  option  can  produce  significant  performance 
improvements. 

delete-dead-defs 

true 

Remove  assignments  to  variables  that  will  not  be  used 
again. 

delete-moves 

true 

Tiy  to  remove  unnecessary  move  statements. 

delete-touches 

true 

'Try  to  remove  unnecessary  touch  statements. 

vflow-optimizations 

true 

Calculate  dataflow  information  and  use  it  to  perform  a  vari¬ 
ety  of  optimizations  such  as  changing  x<— y=0,  branch  if  x 
false  sequences  to  BNE  instructions. 

fold-constants 

true 

Fold  constants.  For  example,  replace  1+2  by  3.  Also  remove 
conditional  branches  when  it  can  be  determined  that  the 
condition  is  always  true  or  always  false. 

fold-global- 

constants 

true 

Fold  constants  globally.  For  example,  replace  a  call  through 
a  selector  with  a  call  of  the  method  when  the  method  can  be 
determined  using  type  analysis.  This  option  is  relevant  only 
when  fold-constants  is  true. 

forward-tails 

true 

Enable  the  altering  of  application  hcodes  immediately  fol¬ 
lowed  by  returns  into  tail-forwarded  applications  which  al¬ 
low  the  process  to  be  deallocated  and  the  answer  directly 
forwarded  to  the  caller.  This  is  the  equivalent  of  tail  recur- 

merge-code 

true 

Sion. 

Merge  common  pieces  of  code  wherever  possible. 

inline 

true 

Inline  small  functions. 

inline-size-cutoff 

12 

Set  the  size  cutoff  for  automatically  deciding  whether  to  in¬ 
line  a  function.  Increasing  this  number  causes  larger  func¬ 
tions  to  be  inlined. 

optimize-built-ins 

true 

Perform  local  built-in  optimizations  such  as  changing  mul¬ 
tiplications  to  shifts. 

compact-vars 

true 

Compact  variables  in  the  context  to  use  as  few  slots  as  pos¬ 
sible. 

reg-variables 

true 

Assign  variables  to  registers  whenever  possible. 

Iru-register- 

allocation 

true 

Use  the  least-recently-used  algorithm  to  allocate  temporary 
registers  during  code  generation. 

frame-touches 

true 

Accumulate  information  about  which  variables  are  touched 
and  optimize  touches  when  the  variables  are  known  to  be 
touched. 

frame-regs 

true 

Keep  track  of  variables  in  the  registers  during  code  genera¬ 
tion  and  use  values  from  the  registers  instead  of  from  mem¬ 
ory  whenever  possible. 

frame-migrate 

true 

Keep  track  of  whether  it  is  possible  for  the  instance  object  to 
have  migrated  away.  Don’t  force  it  if  it  could  not  have  mi¬ 
grated  away. 

lazy-ivar-access 

true 

Don’t  XLATE  the  instance  object  if  there  are  no  references 
to  it. 

Don’t  allocate  a  context  unless  it  is  actually  used. 

lazy-contexts 

t  rue 

fast-contexts 

true 

Use  fast  contexts  whenever  possible. 
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optimize-send-self 

fast-apply 

compact-sends 

compact- DCs 
delete-locals 


warn-free-refer- 

ences 

progress 

detailed-progress 

permanent- 

definitions 


print-pc 

lisp-break 


true  Send  message  to  the  current  node  if  the  receiver  is  self  and 
it  is  not  atomic. 

true  Use  ApplyFunction  and  ApplySelector  instead  of  Apply 
whenever  possible. 

true  Try  to  combine  SENDS  and  SENDEs  into  SEND2s  and 
SEND2ES. 

true  Try  to  align  DCs  on  word  boundaries  whenever  possible. 

true  Delete  local  variables  in  an  intermediate  stage  of  the  compi¬ 

lation.  This  makes  no  difference  in  the  final  output,  but 
makes  the  hcode  look  prettier  and  may  speed  up  code  gen¬ 
eration. 

false  Emit  a  warning  every  time  a  free  reference  is  found  in  a 
method  or  function. 

true  Print  progress  reports. 

false  Print  very  long  progress  reports. 

false  Use  defconstant  instead  of  defparameter  when  compil¬ 
ing  function  and  method  definitions.  When  this  option  is  set, 
a  warning  is  emitted  every  time  a  free  reference  is  found  in 
a  method  or  function  regardless  of  the  setting  of  warn-free- 
references. 

true  Print  program  counter  values  as  comments  in  output. 

true  Enter  a  Lisp  break  loop  upon  a  Concurrent  Smalltalk  warn¬ 
ing  or  error. 
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Loading  Cosmos 

To  use  Cosmos,  launch  MDPSim  using  the  Cosmos.m  file  as  an  argument.  You  may  also 
wish  to  specify  the  J-Machine’s  dimensions  as  arguments  to  MDPSim.  Use -x  x  -y  y  -z 
z,  where  x,y,  and  z  are  integers;  they  should  be  powers  of  two.  To  avoid  using  too  much 
memory,  you  may  wish  to  allocate  less  memory  per  MDP  with  the  -maize  mem  option. 

When  Cosmos.m  is  assembled  by  MDPSim,  it  will  automatically  load  the  operating  system 
onto  the  MDPs  emd  initialize  the  MDPs.  This  process  may  take  anywhere  from  a  few  seconds 
to  a  few  minutes  depending  on  how  many  MDPs  are  present  and  the  speed  of  the  host  com¬ 
puter. 

Loading  User  Programs 

Once  Cosmos  is  ready,  a  user  program  compiled  by  Optimist  II  can  be  loaded.  Use  the 
MDPSim  INCLUDE  command  to  load  the  program  generated  by  Optimist  II.  Keep  in  mind 
that  Cosmos  puts  MDPSim  into  the  case-sensitive  mode,  so  the  case  of  identifiers  and  com¬ 
mands  matters;  MDPSim  recognizes  commands  which  are  either  all  upper  case  or  all  lower 
case  characters. 

MESSAGE  fib4 
MSG; msgApply | 5 
{fFib) 

4 

lONODE 

0 

END 

Figure  C-1.  An  Injected  Application  Message 

This  message  calls  the  (Fib  function  with  the  argument  4.  The  message  itself  can  be  injected  by  executing  the 
command  inject  f  ib4.  The  5  is  the  length  of  the  message.  (  fFib)  is  Optimist  M  s  output  name  for  the  func¬ 
tion  to  be  called  (see  the  Optimist  II  output  file  if  you  are  unsure  about  the  name),  4  is  the  argument,  and  ICNOCE 
and  0  are  magic  numbers  that  cause  the  Reply  message  to  be  printed  by  MDPSim.  More  than  one  one  argument 
can  be  specified,  as  long  as  the  length  of  the  message  (the  5)  is  increased  appropriately. 

Once  the  user  program  has  been  loaded,  it  is  a  good  idea  to  build  a  few  templates  for  mes¬ 
sages  to  be  injected  into  the  program.  An  application  message  should  have  the  format  shown 
in  Figure  C-1.  If  the  messages  will  be  used  for  several  sessions,  it  might  be  appropriate  to 
put  them  into  a  file  and  include  that  file.  Application  messages  should  never  be  injected 
before  the  program  is  installed. 

Instead  of  issuing  the  include  commands  manually,  you  can  also  specify  the  files  on 
MDPSim’s  command  line,  as  was  done  in  the  example  in  Figure  5-14. 

Running  Programs 

To  run  a  program,  execute  tl^e  inject  command  on  the  message  on  which  the  program 
should  be  called  and  then  run  the  program.  Remember  to  specify  the  processor  onto  which 
INJECT  should  inject  the  application  message;  otherwise,  inject  will  inject  a  copy  of  the 
message  to  every  processor,  and  as  many  copies  of  the  program  will  execute  simultaneously 
as  there  are  processors  in  the  simulated  system. 
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MDPSim  allows  statistics  to  be  gathered  about  programs  which  are  executed  on  it.  If  the 
statistics  should  only  include  data  about  the  running  program,  they  should  be  reset  after  the 
program  is  downloaded  and  before  it  is  run.  See  the  current  MDPSim  manual  [25]  and 
Figure  5-14  of  this  document  for  more  details. 

When  you  finish  the  desired  program  runs,  use  the  QUIT  command  to  exit  the  simulator  and 
the  quit  menu  item  to  exit  MPW.  In  an  emergency,  command-shifi-period  can  be  used  to 
abort  MDPSim;  command-period  aborts  the  running  MDP  program  and  returns  to  MDPSim’s 
command  line  (use  control-C  on  UNIX  machines). 
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Appendix  D.  MDP  Architecture  Summary 

This  appendix  is  a  summary  of  the  current  version  of  the  MDP  architecture.  A  slightly  obso¬ 
lete  full  version  of  the  architecture  can  be  found  in  [16].  Many  details  have  been  simplified 
in  order  to  keep  this  Appendix  to  a  reasonable  length. 

Introduction 


The  Message-Driven  Processor  is  a  processing  node  for  the  J-Machine,  a  message-passing 
concurrent  computer.  The  MDP  is  designed  to  provide  support  for  fine-grained  concurrent 
computation.  Towards  this  goal  the  processor  includes  hardware  for  message  queueing,  low- 
latency  message  dispatching,  and  message  sending.  The  same  chip  also  contains  a  network 
interface  and  a  router  to  allow  the  routing  of  messages  throughout  the  network  without  any 
processor  intervention. 

The  size  of  the  MDFs  register  set  is  limited  to  minimize  context-switching  time.  Much  of  the 
memory  is  on  the  chip  to  improve  performance  and  reduce  the  chip’s  pin  count  and  the  chip 
count  for  the  concurrent  computer.  Having  memory  on  chip  allows  more  flexibility  in  the  use 
of  memory  than  in  designs  with  off-chip  memory.  For  example,  a  portion  of  memory  may  be 
designated  as  a  two-way  set-associative  cache  to  be  used  by  the  xlate  instruction.  Never¬ 
theless,  since  current  technological  limitations  restrict  the  size  of  the  on-chip  memory  to 
about  4096  words,  an  external  memory  interface  has  been  provided  to  allow  access  to  slow, 
off-chip  DRAM. 

The  MDP  is  also  designed  to  efficiently  support  object-oriented  programming.  Every  MDP 
word  consists  of  32  data  bits  and  a  4  bit  tag  that  classifies  the  word  as  an  integer,  boolean, 
address,  instruction,  pointer,  or  other  data.  The  MDFs  four  address  registers  include  both 
base  addresses  and  lengths,  so  all  memory  accesses  are  bounds  checked.  Normally  the  ad¬ 
dress  registers  point  to  objects,  so,  since  absolute  memory  addressing  is  not  allowed  except  by 
the  operating  system,  memory  references  can  only  be  made  to  objects  relative  to  their  begin¬ 
nings.  Having  tags  and  no  absolute  references  permits  the  use  of  garbage  collection  and 
transparent  migration  of  objects  to  other  MDP  nodes  on  the  network. 

The  MDP  is  almost  completely  message-driven.  It  is  controlled  by  the  messages  arriving 
from  the  network  that  are  automatically  queued  and  processed.  TTiere  are  two  priority  levels 
to  allow  urgent  messages  to  interrupt  normal  processing.  There  is  also  limited  support  for  a 
background  mode  of  execution  when  no  messages  are  waiting  in  the  queues. 

Processor  State 


The  processor  state  of  the  MDP  is  kept  in  a  set  of  registers  shown  in  Figure  D-1.  There  are 
three  independent  copies  of  most  registers  for  each  of  the  two  priorities  of  the  MDP,  allowing 
easy  priority  switches  while  keeping  the  integrity  of  the  registers.  The  registers  are  symboli¬ 
cally  represented  as  follows; 


R0-R3 

A0-A3 

ID0-ID3 

Q.  M,  U,  F,  I,  P,  B 

IP 

FIR 

FIP 

FOPO,  FOPl 

QBM 

QHL 


general-purpose  data  registers 
address  registers 
ID  registers 
flags 

instruction  pointer  register 
faulted  instruction  register 
faulted  instruction  pointer  register 
faulted  operand  registers 
queue  base/limit  register 
queue  head/tail  register 
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TBM  translation  baso/masK  register 

NNR  node  number  register 

MAR  memory  address  register 


The  Q  flag  controls  message  queue  access  through  register  A3,  while  the  M  flag  guards  against 
inter-priority  message  deadlocks.  Setting  the  U  {unchecked  mode)  flag  disables  type  and 
)verflow  faults.  Setting  the  F  (faulted)  flag  vectors  all  faults  to  the  CATASTROPHE  vector;  this 
flag  is  often  set  in  critical  sections  of  fault  handlers.  Setting  the  I  (interrupt)  flag  prevents 
higher-priority  interrupts.  The  B  and  P  flags  encode  the  current  priority  level. 


Priority  Level  1  ■ 

- 1  iiunij  i-uiui  V 

"'i^uui^^rvui  itl 


35  32  31  0 


tag 

RO 

tag 

R1 

tag 
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base  AO 

length 
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1 

1 
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base  A1 

length 
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1 

1 

0 

i 

base  A2 

length 

1 

1 

1 

D 

0 

base  A3 

length 

1 

1 

1 

n 

■ 

■ 

■■1 

■ 

m 

1 

35  32  31  0 


IDO 

iPI 

ID1 

1D2 

ID3 

Hi 

FIP 

u 

1  ■ "  n 

n 

□ 

I'l  ^ 

iLi 

35  32  31  16  0 


35  32  31  0 


1^ 

FOPO 

Bl 

FOP1 

2829  10  9  0 


F 

base  QBM 

mask 

head  QHL 

length 

20  0  15  0 

MAR 


Figure  D-1.  The  MDP  Register  Set. 


NNR 
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10  9 


base  TBM 


mask 
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Data  Types 

The  data  types  that  may  be  used  in  a  word  are  shown  in  Figure  D-2.  All  data  types  except 
FUT  and  CFUT  may  be  moved,  compared  with  EQ  and  NEQ,  XLATEd  and  ENTERed,  RTAGged, 
WTAGged,  CHECKed,  and  executed.  Executing  a  non-lNST  word  causes  it  to  be  loaded  into  RO. 
Some  data  types  allow  additional  operations,  which  are  listed  in  detail  in  the  description  of 
the  instruction  set. 


SYM 
INT 
BOOL 
ADDR 
IP 

MSG 
CFUT 
FUT 
TAGS 
TAG9 
TAGA 
TAGB 
INSTO 
1NST1 
INST2 
INST3 

Figure  D-2.  The  MDP  Data  Types. 

•  SYM  contains  an  atomic  symbol,  equal  and  nequal  are  allowed  on  SYMbols.  It  the  data 
portion  of  a  symbol  contains  all  zeroes,  the  word  takes  on  the  value  of  nil.  Cosmos  re¬ 
names  SYM  as  TAGO  and  inserts  a  subtag  in  bits  28  through  31  to  distinguish  between  a 
few  more  types. 

•  INT  contains  a  two’s  complement  integer  between  ■2^'^  and  inclusive.  All  arith¬ 

metic,  logical,  and  comparison  operations  are  allowed  on  ints. 

•  BOOL  contains  a  boolean  value,  which  is  either  true  (b=l)  or  false  (b=0).  All  logical  and 
comparison  operations  are  allowed  on  bools;  false  is  considered  to  be  less  than  true. 

•  ADDR  contains  a  base/length  pair  that  may  be  loaded  into  either  one  of  the  address  regis¬ 
ters  or  QBM,  QHL,  or  TBM.  The  uses  of  bits  30  and  31  vary  among  these  registers. 

•  IP  contains  a  value  appropriate  for  loading  into  the  ip. 

•  MSG  is  the  header  of  a  message.  It  is  similar  to  an  ip.  Due  to  a  shortage  of  tags,  Cos¬ 
mos  also  uses  this  tag  under  the  name  obj  as  an  object  header. 

•  CFUT  contains  a  context  future.  Almost  all  operations  fault  on  context  futures.  They  are 
not  meant  to  be  MOVEable.  cfuts  are  used  as  placeholders  for  unavailable  values  to  be 
computed  in  parallel  by  other  processes;  an  attempt  to  read  a  cfut  before  its  value  is 
available  will  fault,  and  the  operating  system  will  suspend  the  current  process  until  the 
value  is  available. 


33332  11  1 

52109  76  09  0 


0  0  0  0 

value  (0=NIL) 

0  0  0  1 

two’s  complement  value 

0  0  10 

0  ...  Ob 

0  0  11 

r 

i 

base 

length 

0  10  0 

u 

f 

offset 

p  a  0  ...  0 

0  10  1 

u 

f 

offset 

length 

0  110 

user-defined 

0  111 

user-defined 

10  0  0 

user-defined 

10  0  1 

user-defined 

10  10 

user-defined 

10  11 

user-defined 

1  1 

0  0  first  instruction 

second  instruction 

1  1 

0  1  first  instruction 

second  instruction 

1  1 

1  0  first  instruction 

second  instruction 

1  1 

1  1  first  instruction 

second  instruction 
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•  FUT  is  a  standard  future,  futs  may  be  moved,  and  their  tags  may  be  read  and  written, 
but  they  may  not  participate  in  any  primitive  operations  such  as  addition  or  checking  for 
equality.  As  with  cfuts,  an  attempt  to  use  a  fut  in  a  primitive  operation  will  cause  a 
fault,  and  the  operating  system  will  have  to  provide  the  appropriate  value  for  the  fut. 

•  TAGS  through  tags  are  tags  for  operating  system-defined  words.  They  cause  faults  on 
all  primitive  operations  except  eq,  neq,  bnil,  and  bnnil.  Cosmos  renames  these  tags 
as  ID,  DID,  TAGA,  and  float,  respectively. 

•  INSTO  through  inst3  are  tags  for  instructions.  The  two  instructions  in  a  word  occupy  a 
total  of  34  bits,  so  two  tag  bits  are  also  used  to  encode  them. 

Network  Interface 

Incoming  messages  are  queued  in  message  queues  before  being  dispatched  and  processed. 
There  are  two  message  queues,  one  for  each  priority  level.  When  a  message  arrives,  register 
A3  is  set  up  to  point  to  it  in  the  message  queue,  and  execution  begins  at  the  address  specified 
by  the  message  header.  A  message  may  be  processed  as  soon  as  its  first  word  arrives;  the 
processor  does  not  wait  until  the  entire  message  is  present  before  processing  it.  Memory  ac¬ 
cesses  to  the  message  are  checked  to  make  sure  that  the  processor  does  not  try  to  access  a 
word  in  the  message  before  it  arrives;  if  the  processor  tries  to  access  a  word  too  early,  it  waits 
until  the  word  has  arrived. 

The  SUSPEND  instruction  informs  the  hardware  that  the  processing  of  the  current  message  is 
done  and  that  it  should  fetch  the  next  message. 

Message  Transmission 

The  SEND,  SEND2,  SENDE,  and  SEND25  instructions  are  used  to  send  messages.  The  first 
word  sent  specifies  the  node  number  of  the  destination  node  (i.e.  the  destination  node’s  NNR 
value)  in  the  low  16  bits.  The  SEND  instruction  will  use  the  current  node’s  NNR  and  the  desti¬ 
nation  node  number  to  find  the  relative  offsets  in  the  X  and  Y  dimensions  that  the  network 
controllers  will  use  in  routing  the  messages  through  the  network.  There  are  actually  two  fla¬ 
vors  of  each  send  instruction:  SENDO,  SEND20,  SENDEO,  and  SEND2E0  send  words  of  priority 
0  messages,  while  SENDl,  SEND21,  SENDEl,  and  SEND2E1  send  words  of  priority  1  messages. 
The  priority  of  the  message  is  independent  of  the  priority  of  the  process  that  is  sending  it. 

The  initial  routing  word  is  followed  by  a  number  of  words  which  the  network  delivers  verba¬ 
tim  to  the  destination  node.  The  network  does  not  examine  the  contents  of  these  words.  The 
message  is  terminated  by  a  SENDE  or  SEND2E  instruction,  which  send  the  last  one  or  two,  re¬ 
spectively,  words  of  it  and  inform  the  network  to  actually  transmit  the  message.  The  first 
word  that  arrives  at  the  destination  node  (the  second  word  actually  sent  since  the  routing 
word  is  only  used  by  the  network  and  doesn’t  arrive  at  the  destination  node)  must  be  tagged 
MSG.  It  contains  the  length  of  that  message  including  that  word  but  not  including  the  routing 
word  preceding  it.  It  also  contains  the  initial  value  of  the  IP  at  which  execution  is  supposed 
to  start.  The  destination  node  will  fault  MSG  if  this  word  is  incorrect. 

The  total  time  between  the  first  SEND  and  the  SENDE  should  be  as  short  as  possible  to  avoid 
blocking  the  network.  For  the  same  reason,  faults  should  be  avoided  while  sending. 

Fault  Processing 

When  a  fault  occurs,  the  instruction  that  caused  the  fault  is  saved  in  the  FIR  repster,  the 
current  IP  (which  points  one  instruction  beyond  the  faulting  instruction)  is  saved  in  the  FIP 
register,  and  the  values  of  the  instruction  operands,  if  any,  are  saved  in  the  FOPO  and  FOPl 
registers.  If  the  fault  occurred  while  fetching  an  instruction,  the  FIR  is  set  to  NIL  and  the 
FIP  points  to  the  instruction.  The  IP  is  then  fetched  from  the  memory  location  whose  ad¬ 
dress  is  equal  to  the  fault  number  plus  the  base  of  the  fault  vector  table  of  the  current  prior¬ 
ity.  If  the  F  bit  was,  the  IP  is  loaded  from  the  CATASTROPHE  vector  instead.  The  U,  A,  and  F 
flags  receive  their  new  values  from  the  loaded  IP.  The  faults  are  listed  in  Table  D-1. 
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Table  D-1.  MDP  Faults 


Name 

Fault 

Number 

Description 

CATASTROPHE 

$00 

Double  fault, bad  vector,  or  other  catastrophe. 

INTERRUPT 

$01 

Interrupt  pin  has  gone  active. 

QUEUE 

$02 

Message  queue  about  to  overflow. 

SEND 

$03 

Send  buffer  full. 

ILGINST 

$04 

Illegal  instruction. 

DRAMERR 

$05 

Double  bit  error  in  the  external  RAM. 

INVADR 

$06 

Attempt  to  access  data  through  address  register  with  i  bit 
set. 

LIMIT 

$07 

Attempt  to  access  object  data  past  limit. 

ADRTYPE 

$08 

Index  in  indexed  addressing  mode  not  tagged  int. 

EARLY 

$09 

Attempt  to  access  data  in  message  queue  before  it  arrived. 

MSG 

$0A 

Bad  message  header. 

XLATE 

$0B 

XLATE  missed. 

OVERFLOW 

$0C 

Integer  arithmetic  overflow. 

CFUT 

$0D 

Attempted  operation  on  a  word  tagged  cfut. 

FUT 

$0E 

Attempted  operation  on  a  word  tagged  fut. 

TAG8 

$0F 

Attempted  operation  on  a  word  tagged  tags. 

TAG  9 

$10 

Attempted  operation  on  t  not6  tagged  tag9. 

TAGA 

$11 

Attempted  operation  on  a  word  tagged  taga. 

TAGS 

$12 

Attempted  operation  on  a  word  tagged  tags. 

TYPE 

$13 

An  operand  or  a  combination  of  operands  with  a  bad  tag 
type  used  in  an  instruction. 

$14-$1F 

Reserved  for  future  faults. 

If  multiple  faults  occur  slmultaneou;;iy  the  fault  vector  chosen  is  the  one  that  has  the  highest  precedence.  Each 
fault  is  assigned  a  preoedenoe  by  its  fault  number,  lower  fault  numbers  correspond  to  higher  precedence. 


Instruction  Encoding 

The  program  executed  by  the  MDP  consists  of  instructions  and  constants.  A  constant  is  any 
word  not  tagged  INSTO  through  INST3  that  is  encountered  in  the  instruction  stream.  When 
a  constant  word  is  encountered,  that  word  is  loaded  into  RO  and  execution  proceeds  with  the 
next  word  (the  assembler  syntax  for  including  a  word  in  the  code  stream  is  DC). 

Every  instruction  is  17  bits  long.  Two  17-bit  instructions  are  packed  into  a  word.  Since  a 
word  has  only  32  data  bits,  two  tag  bits  are  also  used  to  specify  the  instructions.  The  in¬ 
struction  in  the  high  part  of  the  word  is  executed  first,  followed  by  the  instruction  in  the  low 
part  of  the  word.  As  a  matter  of  convention,  if  only  one  instruction  is  present  in  a  word,  it 
should  be  placed  in  the  high  part,  and  the  low  part  of  the  word  set  to  all  zeros. 

The  format  of  an  instruction  is  as  follows: 


16  11  10  9  8  7  6  0 


2nd 

1st 

Opcode 

reg 

reg 

Addressing  mode 

# 

n 

op2 

opt 

apC 

The  opcode  field  specifies  one  of  64  possible  instructions.  The  other  fields  specify  three 
operands:  instructions  that  don’t  require  three  operands  ignore  some  of  the  operand  fields. 
Operands  1  and  2  must  be  data  registers;  their  numbers  (0  through  3)  are  encoded  in  the  1st 
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reg  #  and  2nd  reg  #  Helds.  Operand  2,  if  used,  is  always  the  destination  of  an  operation  and 
operand  1,  if  used,  is  always  a  source. 


6  0 


Normal 

Addressing  Moc 

_ 1 _ 1 _ 1 _ 1 _ 

je 

_ 1 _ 

0 

0 

0 

0 

0 

0 

0 

0 

0 

1 

0 

0 

0 

1 

0 

0  0 

0 

0 

0 

1 

0 

0  1 

0 

0 

0 

1 

0 

1  0 

0 

0 

0 

1 

0 

1  1 

0 

0 

0 

1 

1 

0  0 

0 

0 

0 

1 

1 

0  1 

0 

0 

0 

1 

1 

1  0 

0 

0 

0 

1 

1 

1  1 

0 

0 

1 

Rx 

An 

0 

1 

imm 

1 

imm 

An 

Syntax 

Addressing  Mode 

Rn 

Data  register  Rn 

An 

Address  register  An 

NIL 

Immediate  constant  NIL  (SYM:0) 

FALSE 

Immediate  constant  FALSE  (BOOL:0) 

TRUE 

Immediate  constant  TRUE  (BOOL:1) 

$80000000 

Immediate  constant  INT  $80000000 

$FF 

Immediate  constant  INT:$000000FF 

$3FF 

Immediate  constant  INT:$000003FF 

$FFFF 

Immediate  constant  INT;$00O0FFFF 

$FFFFF 

Immediate  constant  INT:$000FFFFF 

[Rx,  An] 

Offset  Rx  in  object  An 

imm 

Immediate  imm  (signed) 

[ imm, An] 

Offset  imm  (unsigned)  in  object  An 

Fi^re  D-3.  The  MDP  Normal  Addressing  Modes. 

The  immediate  constants  are  eight  immediate  values  outside  the  range  They  are  provided  for 

convenience  and  code  density  improvement.  The  $FF  and  $FFFF  constants  are  useful  for  masking  bytes  and 
words,  while  the  $3FF  and  $FFFFF  constants  may  be  used  for  masking  lengths  and  addresses. 


Operand  0  can  be  used  as  a  source  or  a  destination  in  an  instruction.  It  can  hold  two  possible 
encodings.  A  normal  instruction  has  opO  address  mode  encodings  as  shown  in  Figure  D-3. 
The  register-oriented  opO  mode  is  used  only  by  three  variants  of  the  MOVE  instruction.  If  an 
instruction  uses  the  register-oriented  opO,  the  encodings  are  as  in  Figure  D-4. 


Instruction  Set  Summary 

The  instructions  supported  by  the  MDP  are  summarized  in  Table  D-2.  The  Types  column 
specifies  the  types  on  which  the  instruction  operates;  if  the  arguments  have  different  types, 
the  instruction  faults.  Except  for  a  MOVE  to  memory,  all  instructions  fault  when  any  of  their 
operands  are  tagged  CFUT.  Also,  except  for  moves  and  SENDS,  all  instructions  fault  when  any 
of  their  operands  are  tagged  FUT. 
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6  0 

Syntax  Addressing  Mode 

Rn  Data  register  Rn 

An  Address  register  An 

iDn  ID  register  IDn 

Fip  Trapped  Instruction  pointer 

FIR  Trapped  Instruction  register 

FOP'O  Trapped  OPO  register 

FOPi  Trapped  OP  1  register 

QBM  Queue  Base/Mask  register 

QHL  Queue  Head/Length  register 

IP  Instruction  Pointer  register 

TBM  Translation  Base/Mask  register 

NNR  Node  Number  register 

mar  Memory  Address  Bus  register 

Unused  (ILGINST  fault) 

Unused  (ILGINST  fault) 

P  Priority  Level  flag 

B  Background  Execution  flag 

I  Interrupt  flag 

F  Fault  flag 

u  Unchecked  flag 

Q  A3  Queue  flag 

Unused  (ILGINST  fault) 

Unused  (ILGINST  fault) 

Figure  D-4.  The  MDP  Register  Oriented  Addressing  Modes. 

B  and  P  represent  the  priority  of  the  register  being  accessed  XORed  with  the  current  priority  For  example,  00 
indicates  the  current  priority,  while  01  would  let  priority  1  access  priority  O's  registers,  and  1 1  would  let  priority  1 
access  the  background  registers.  The  assembler  syntax  for  specifying  a  register  belonging  to  the  other  priority  is 
the  register  name  followed  by  a  b  to  flip  the  B  bit  and/or  a  backquote  (  )  to  flip  the  P  bit 

Table  D-2.  MDP  Instructions 


instruction  Brief  Description  Types 

Generai  Movement  and  Type  Instructions 

MOVE  src,Rd  Rd  <- Src.  Src  may  be  3  register  addressing  mode.  All 

MOVE  Rs,  Dst  Dst «-  Rs.  Dst  may  be  a  register  addressing  mode.  All 

MOVE  Src,iP  ip«-Src.  srcmaybearegisteraddressingmode.  All 
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RTAG 

Src, Rd 

Rd<-  iNT:tag(Src) 

All 

WTAG 

Rs, Src, Rd 

Rd«—  SrclRs 

Ail 

CHECK 

Rs,  Src, Rd 

Rd  <-  BOOL:tag(Rs)=Src 

All 

Arithmetic  and  Logicai  instructions 

NEG 

Src, Rd 

Rd< — Src 

I  NT 

ADD 

Rs, Src, Rd 

Rd«—  Rs+Src 

I  NT 

SUB 

Rs,  Src, Rd 

Rd«—  Rs-Src 

I  NT 

CARRY 

Rs, Src, Rd 

Rd  <-  Carry  from  the  addition  of  Rs  and  s  rc 

I  NT 

MUL 

Rs ,  S  rc ,  Rd 

Rd<—  Rs'Src 

I  NT 

MULH 

Rs,  Src,  Rd 

Rd  ♦-  High  32  bits  of  64-bit  unsigned  product  of  Rs  and  s  r  c 

INT 

ASH 

Rs,  Src,  Rd 

Rd  *-  Rs  shifted  left  arithmetically  by  Src  bits 

INT 

LSH 

Rs,  Src,  Rd 

Rd  <-  Rs  shitted  left  logically  by  Src  bits 

INT 

ROT 

Rs,  Src,  Rd 

Rd «-  Rs  rotated  left  by  Src  (mod  32)  bits 

INT 

FFB 

S  rc , Rd 

Rd «-  31 -position  of  leftmost  bit  of  Rs  differing  from  bit  31. 

INT 

NOT 

Src, Rd 

Rdf- NOT  Src 

INT,  BOOL 

AND 

Rs, Src, Rd 

Rdf—  Rs  AND  Src 

INT,  BOOL 

OR 

Rs, Src, Rd 

Rd  f—  Rs  OR  S  rc 

INT,  BOOL 

XOR 

Rs, Src, Rd 

Rd  f—  Rs  XOR  Src 

INT,  BOOL 

LT 

Rs ,  S  rc ,  Rd 

Rdf-  BOOL;Rs<Src 

INT,  BOOL 

LE 

Rs,  Src, Rd 

Rdf-  BOOL:Rs<Src 

INT,  BOOL 

GT 

Rs, Src, Rd 

Rd  f-  BOOL;Rs>Src 

INT.  BOOL 

GE 

Rs, Src, Rd 

Rdf-  BOOL;Rs>Src 

INT,  BOOL 

EQUAL 

Rs, Src, Rd 

Rdf—  BOOL;Rs=Src 

SYM,  INT,  BOOL 

NEQUAL 

Rs, Src, Rd 

Rd  f—  BOOLlRs^Src 

SYM,  INT,  BOOL 

EQ 

Rs, Src, Rd 

Rdf-  BOOL:Rs=Src  (Pointer Comparison  only) 

All 

NEQ 

Rs,  Src, Rd 

Rd  f-  BCOLiRs^tSrc  (Pointer  comparison  only) 

All 

Network  Instructions 

SEND 

Src 

Send  Src  onto  the  network 

All 

SENDE 

Src 

Send  Src  onto  the  network  and  terminate  message 

All 

SEND  2 

Rs, Src 

Send  Rs  and  Src  onto  the  network 

All 

SEND2E 

Rs, Src 

Send  Rs  and  Src  onto  the  network  and  terminate  message 

All 

Associative  Lookup  Table  Instructions 

XLATE 

Rs, Dst, C 

Dst  f-  associative  lookup  in  the  associative  lookup  table  of  Rs 

All 

ENTER 

Src, Rs 

Enter  (src,  Dst)  into  the  associative  lookup  table 

All 

PROBE 

Src, Rd 

Rd  f-  bool;s  rc  is  in  the  associative  lookup  table 

All 

Special  Instructions 

NOP 

No  operation 

INVAL 

Invalidate  all  relocatable  address  registers 

SUSPEND 

Terminate  current  process  and  letch  another  message 

CALL 

Src 

Call  system  routine  numbered  src 

Branches 

BR 

Src 

Branch  forward  Src  words 

BNIL 

Rs, Src 

Branch  forward  Src  words  if  Rs  is  Ni  L 

Ail 

BNNIL 

Rs, Src 

Branch  forward  Src  words  if  Rs  is  not  ni l 

All 

BF 

Rs, Src 

Branch  forward  Src  words  if  Rs  is  false 

BOOL 

BT 

Rs, Src 

Branch  forward  s  rc  words  if  Rs  is  true 

BOOL 

BZ 

Rs, Src 

Branch  forward  s  rc  words  if  Rs  is  zero 

INT 

BNZ 

Rs, Src 

Branch  fonward  Src  words  if  Rs  is  non-zero 

INT 
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Appendix  E.  Optimist  II  Listing 

This  listing  has  been  removed  due  to  space  constraints.  For  a  copy  of  the  source,  please  send 
mail  to  waldemar@ai  .mit  .edu  or  billdSai  .mit.edu.  A  slightly  older,  printed  copy  of  the 
source  can  also  be  found  in  the  original  Ivlaster’s  thesis  version  of  this  document. 
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Cosmos.! 


HOP  Op«r«tinq  System 

version  2.3 

written  by 
Wsldemsr  Horwst 

Hester's  thesis  under  Prof.  William  Dally 

Harch  28,  1989 
Hay  1991 

Send  problems  and  comments  to 
waldemarf hx . Ics .mit . edu . 

Copyright  1989,  199Q,  1991  Waldemar  Horwat 


CASE  ON 


:  I  Parameters  i 

- ....... - - - 

.’These  parameters  are  used  to  customize  Cosmos.  You  can  override  the  deiauit  settings  of 
.‘these  parameters  by  using  ~d  PEALMODE*!,  etc.  commano  line  options. 

.•REALHODE  is  true  if  the  code  should  be  compiled  for  a  real  J-Machire  instead  of  HDPSim. 
.’This  turns  off  the  STOP  instruction  ithis  means  you  can*t  use  RUN). 

IF  IDefined (REALHODE) 

LABEL  REALHODE  -  0 
END 

.’FASTSIM  is  true  if  the  loop  that  clears  memory  to  CFUTa  sr.oulo  be  sxipped. 

IF  !Defined(FASTSIM) 

LABEL  FASTSIH  -  1 


.'DEBUG  is  true  if  extra  debugging  code  should  be  run. 
IF  ;Defined(0£BUG) 

LABEL  DEBUG  •  1 
END 


; Equates 


LABEL  LoqNNooes  *•  LGNNODES 
LABEL  NNodes  •  l<<LogNNodes 

LABEL  nFa stContext s  -  8  .-Number  of  fast  context  to  allocate. 


I  Hemory  Map 


LABEL  GlobalsOStart  -  0 

LABEL  Globa IsOEnd  -  $40 

LABEL  GlobalslStart  -  $4n 

LABEL  GlobalslEnd  -  $80 

LABEL  ADR  FaultsOStart  -  $80 

LABEL  ADR  FaultsOEnd  •  $A0 

LABEL  ADR  FaultslStart  -  SAC 

LABEL  ADR  FaultslEnd  -  SCO 

LABEL  ADR  CailsStdrt  •  SCO 

LABEL  ADR  CallsEnd  -  $100 

LABEL  ADR  QueuelStart  -  $100 

LABEL  ADR  QueuelEnd  -  $100 

LABEL  ADR  QueueOStart  -  $100 

LABEL  ADR  QueueOEnd  -  $?0C 

LABEL  ADR  XlateStart  >  $200 

LABEL  ADR  XIateEnd  -  $400 

LABEL  BRAT Len Log  -  6 

LABEL  BBATf.enqth  -  1  <<BRATLenLcg 

LABEL  ADR  BRATStart  -  $400 

LABEL  ADR  BRATEnd  -  BRATStart ♦BRATLength 

LABEL  ADR  HeapEnd  -  MF.MS1ZF. 


I  Tags 


LABEL  TAG  TACO  •*  0  .Immediate  object  tag. 

LABEL  TAG  OBJ  *  MSG  .'Objects  and  messages  have  the  same  tag. 

LABEL  TAG  CS  -  INSTl  .Cl  a s a / se i eel  or . 


185 


Concurrent  Smalltalk  on  the  Message-Driven  Processor 


;Subtaqa  of  TAGO: 

LABEL  subtagN  •  21 

LABEL  •ubtaqL  -  4 

LABEL  •ubtaqM  -  ( l<<subtagL) -1 

LABEL  tubSYH  •  0 

LABEL  tUbCLASS  -  1 

LABEL  tubSEL  -  2 

LABEL  «UbCHAB  •  3 


;Subt49  offset . 
.'Subtag  length. 
; Subtag  maaii. 

: Symbol . 

:Claaa. 

; SaXector . 
.’Character . 


I  Type# 


.‘Addreee  blta 

LABEL  XengthN  •  0 

LABEL  XengthL  -  10 

LABEL  XengthM  -  ( X <<lengthL) -1 

LABEL  baaeN  -  10 

LABEL  baaeL  -  20 

LABEL  baaeM  -  (l«baaeL)-l 

LABEL  InvaXldN  -  30 

LABEL  Invalid  -  l«lnvaXidN 

LABEL  relN  •  31 

LABEL  rel  »  l«reXN 

LABEL  dlaableN  -  30 

LABEL  disable  -  l<<disabXeN 

;IP  bits 

LABEL  abaN  «  I 

LABEL  aba  •  l«absN 

LABEL  phaaeN  -  9 

LABEL  phase  -  l«phaseN 

LABEL  offsetN  -  10 

LABEL  offaetL  •  20 

LABEL  faultN  -  30 

LABEL  fault  -  l<<faultN 

LABEL  uncheckedN  -  31 

LABEL  unchecked  -  1 <<uncheckedN 


.'Length  field  offset. 
jLenqth  field  length. 
.’Length  field  mask. 

.'Base  field  offset. 

.‘Base  field  length. 

.'Base  field  mask. 

: Invalid  address . 

.‘Relocatable  address. 

.'Disable  bit  of  QBH  regs. 


.’Absolute  IP. 

; IP  phase  bit. 

.-Offset  field  offset. 
.'Offset  field  length. 
.'Fault  flag. 

.'Uncneciceci  mode  flag. 


.-ID  bits 

LABEL  homeNodeN  -  0  .'home  node. 

LABEL  homeNodeL  >  16 

LABEL  homeNodeH  -  < 1 <<homeNodeL) >1 

LABEL  serialN  •  16  .-Serial  numoer  . 

LABEL  serialL  « 

LABEL  serialM  «  < 1 <<serialL) 

LABEL  distobjHemberS'  ••  31  .-Distributed  object  member 

LABEL  distobjHember  ••  1  <<di  stobjMemberN 


f  lag. 


.'DID  bits 

LABEL  initialNoaeN  •  0  .-Initial  node. 

ZJ^BEL  initialNodeL  -  11 

LABEL  InltialNodeM  -  ( 1 «ini t ialNodeL) -1 

LABEL  logStrldeN  -  11  ;2's  complement  Ig ( ♦nooes/lconst ituent s) 

LABEL  logStrideL  -  5 

LABEL  logStrideM  •  ( 1 <<logSt r IdeD - 1 


.'Class/Selector  bits 

LABEL  csSelectorN  -  0  .'Selector. 

LABEL  csSelectorL  •  16 

LABEL  csSelectorM  •  ( l<<csSelectorLI ->1 

LABEL  csClassN  •  16  .'Class. 

LABEL  csClassL  -  16 

label  CSClassH  •  tA . 


;xyz  bits 
LABEL  xN  -  0 
LABEL  XL  -  LGXNODES 
LABEL  XM  -  (1<<XL)-1 
LABEL  xMC  -  |1«S-XL)-I 
LABEL  yN  -  3 
LABEL  yL  -  LGYNODES 
LABEL  yM  -  U<<yLl-1 
LABEL  yMC  -  Il<<5-yL)-l 
LABEL  zN  -  10 
LABEL  zL  •  LGZNODES 
LABEL  zM  •  (1<<ZL)'1 
LABEL  zHC  -  U<<6-ZL)-1 


X  field  off  set . 

X  field  length. 

X  field  mask. 

X  complement  field  mask. 

Y  field  off  set . 
y  field  length. 

Y  field  mask. 

r  co.•T^plerrent  field  mask. 
Z  field  off  Set . 

Z  field  lenqth- 
Z  field  mask. 

Z  complement  field  mask. 


.-These  constants  are  used  to 

LABEL  mX  •  xH 

LABEL  sX  •  0 

LABEL  mY  -  yM<<xL 

LABEL  SY  •  yN-xL 

LABEL  mZ  •  zH<<x:.’y;. 

LABEL  sZ  -  zN-xI.-yL 
LABEL  mS  •  ser la :m< <xL » y L* r L 
LABEL  sS  •  serialN-xL-yL-zL 
;The  nth  object  is  stored  at 


fashion  serial  and  node  n-moers  for  precom.pileo  objects. 


inimX)  <<sx  (nimYK<sY  ir.fcT.i  »  <<s2  jnams  *  <.<ss 


.'These  constants  are  used  to  fashion  numoers  fer  precompilfo  classes  a.na  se.ei.'icrs 
; so  as  to  distribute  tnem  evenly  throughout  the  J-Macninc. 

LABEL  m3  «  xMC< <xL» y L ♦ ? L 

LABEL  s3  -  -yL-zL 

LABEL  m4  •  yMC< <xN’ yL» ? L 

LABEL  04  •  -ZL 

LABEL  mS  •  zMC<<xN«yNA ZL 

LABEL  sb  •  0 

.'The  nth  object  is  stored  at  (n*frXM<sX  i  (nsmy  k<s  r  rnamz  »<<sZ  >  rna.mji  <«  s3  .  (.’lima  »  <<  s4  -  <<5b 


LABEL  nodeHask  -  zM<<7*»iyM<<yN  xM<<xN  ;Masl(  for  qcr.cratinq  ra’ocim  r.oae  n..'r.r*ers. 


LABEL  RandomSeedIncrement  -  b<<zN-? i 3<<yN-l  ’  1 < <xN 
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.-Hardwired  classes: 

LABEL  classPr imit iveClass  -  i2  4mx) <<sX i  (24mYl<<sY ■ (24m^) <<sZ  t  <24m3) <<#3 i {2  4^4  ><<b4i  i24mbi<<sb 
LABEL  classStandardClass  -  (34mX)<<sXi  (34ibY)<<sYI  (34mZ)<<tZi  (34m3)<<s3i  <34»4I<<b4i  <34iiib><<sb 
LABEL  classDist.  IbutedClass  -  (44mx  )<<ax  i  (44mY)  <<sY  i  i44mZ)<<sZ  i  t44ffi3M<s3  i  (44m4 1  <<s4  i  (44si!>  )<<sb 
LABEL  ClassObject  -  <b4mX)  <<SX  l  (b4mY)  <<SY  >  (S4mZ)<<BZ  i  (&4m3)<<B3  I  |b4fB4  ><<a4  i  (b4iibl  <<Sb 
LABEL  classNull  -  (e4t&X}  <<sX  i  (64mY )  <«sY  I  (64mZ)  «sZ  i  (64m3)<<s3  i  (64fli4 )  <<s4  i  (64mS  )<<sb 
LABEL  classSymbol  -  n4iiX)<<sX!  |74iiiY)<<sYi  (74mZ)<<sZ:  C'’4m3)<<s3i  (74n4)<<s4]  (74nbi<<sb 
LABEL  ciassCiass  -  (lainXl  <<sX  |  (l4mY)  <<sY  !  <  l4mZ )  <<tZ  i  t44m3)<cs3  Ml4m4  )<<s4  Ml4mb  )<<sb 
LABEL  ClassSelector  -  ( 94mX)<<sX  l  i  94fnY)  <<SY  M94mZ  )<<sZ  i  (94iii3)  «s3  1 94m4)  <<s4  i  i  94mS)<<Bb 
LABEL  classCharacter  (1  04aX)<<sX  !  ( 1 04nY)<<sY  i  (I04mZ)<<sZ  i  ( 1 04ffi3)  <<s3  It  I04m4  )  <<s4  i  1 1 04irb )  <<sb 
LABEL  class  Integer  •  til  4mX)  <<sx  i  ( lUmY)  <<sY  i  1 1  14rZ)  <<sZ  i  (1  Um3)  <<s3  i  1 1 14m4  t<<s4i  (ll4mb)<<sb 
LABEL  classBoolean  »  1 124mXl  <<SX  i  ( 1 24mY )  <<SY  i  1 124mZ)  <<sZ  i  t  124R!3)  <<s3  '  1 124m4  )  <<b4  i  ( 1 24nb  )<<  sb 
LABEL  classFalse  -  1 1 34mX)  «SX  I  ( 1 34mYi  «SY  >  ( 134mZ  t  «sZ  i  1 1 34>fc3)  «s3  -  ( 134m4 )  <<s4  l  1 1 34mb )  c<sb 
LABEL  classTrue  -  ( 144fflX)<<aXl  <  l44mY)  <<sY  i  (144m2)<<sZ  i  (144m3)  <<s3  <  144fn4i  <<:s4[  (144mb><<ib 
LABEL  classFloat  -  ( 1  b4RiX)<<aX  i  ( 1  b4mY )  <<sY  i  <  1  b4mZ)  <<sZ  i  ( 1  b4m3 1  <<s3  ( 1  b4tn4 )  <<s4  f  ( 1  b4inb  »  <<sb 
LABEL  classFunct ion  -  1 164mX) <<sx i  1 164mY) <<sY i (164mZ)<<sZ • 1 164m3i <<s3 : (164m4 )<<s4 i  ( i64mb t  <<sb 


I  Objects 


LABEL  objectHeader  >  0 
LABEL  objectlD  -  1 

.’Object  header  bits: 

LABEL  hdrLengthN  -  0 

LABEL  hdrLengthL  -  10 

LABEL  hdrLenqthM  -  ( l<<hdr LenqthL) - 1 : 

LABEL  hdrClassN  -  10 

LABEL  hdrClassL  -  16 

LABEL  hdrClassM  -  ( 1 <<hdrClassL) >1 ; 

LABEL  hdrFastN  -  26 

LABEL  hdrFast  -  l<<hdrFastN 

LABEL  hdrDeletedN  -  7‘i 

LABEL  hdrOeietea  -  l<<narDeletedN 

LABEL  hdrCopyableS'  -  26 

LABEL  hdrCopyable  -  1 <<harCopyabieK 

LABEL  hdrPurqeableN  -  29 

LABEL  hflrPurgeable  -  1 <<hdrPurqeab.eN 

LABEL  hdrLockedN  -  30 

LABEL  harLocxed  •  1 < <hd r LocxedN 

LABEL  hdrMarkedS  -  31 

LABEL  harMarited  -  1  <<hdrMa r iteaN 


.-Length  field  offset. 

.-Length  field  -enqth. 

.-Length  field  icasK. 

; Class  field  off  set . 

.-Class  field  length. 

.’Class  field  mass. 

.-Fast  context- 

; ^  ree  eb'eci . 

:  Im.’T.s.taole  ccpyacle  cb;ect. 

; P-rqeable  object . 

; LocKeo  object . 

.-P-jrgeacie  object  ir.aracd  by  s*.eepe: 


.-Class  object  s ; 

LABEL  oClatsMcrd  *•  2 
LABEL  QCla ssNAl ISuper s  •  3 
LABEL  oClas sAl 1  Super s  -  4 


header  word  for  objects  of  c.ass. 

Count  of  a*l  superclasses  Icr  c.ass. 

list  of  al.  superclasses  for  this  c.ass. 


.-  Selectors : 

LABEL  oSelNHethoQs  -  2 
LABEL  oSelHethods  -  3 


-Nurr.ber  of  mctr.ods  -'’''ir.ea  for  tr-.-s  se.ector. 
'L;st  of  cl  ass.'iret  r.cc  pairs  for  this  selector 


; Funct i ons : 

LABEL  oFunct ionNArgs  •  ? 
LABEL  oFunctionCode  -  3 


rNutr.oer  of  arg-ments  or  s'.: 
. Coce  of  function 


.-Closures: 

LABEL  oClosureNArgs  •  2 
LABEL  oClosureCooe  •  3 
LABEL  oClosureFunct  •  4 
LABEL  oClosureDisplay  • 


Number  of  argumerius  or  k:; 
Faulting  instruction 
function  to  be  ca..ed. 
Additional  oisp.ay  arquher.ts. 


.-  Distob  j  s  : 

LABEL  oDi stobjGroup  -  2 
LABEL  oDi st Ob j I ndex  *  3 
LABEL  wDi  S' Ob  j  Lo'-' ic.»  1  Lim  i  t  «  4 


;DiL  cf  a  distributed  OD.ett. 

; Const i t .^ef:t  r.w.mt;er  of  a  ctur.st  i t ^er.t 

.Logical  n..mber  of  const  .  t -ent  s  in  a  distributed  object 


I  Contexts 


LABEL  contextHeader  •*  0 
LABEL  context  ID  *  1 

.-Context  message  and  louais  are  Ir.  Jocations  2  through  i  b . 

LABEL  contextRO  >  16 
LABEL  ContexlRl  "  17 
LABEL  COntexlB2  -  18 
LABEL  contextRS  -  19 
LABEL  contextIDO  -  20 
LABEL  contextID2  -  21 
LABEL  context ID3  -  22 
LABEL  context  IP  -  23 

LABEL  contextNext  -  24  .-Next  context  in  a  cna.r.. 

;A;so  used  to  store  Nil  cr  r^ext  context  nunoer  wneri  watting 
;  for  an  object,  cr  zero  •nen  waitir^g  fcr  a  cfuture. 

LABEL  contextsize  -  2b  .Size  of  a  fast  context 

.-More  locals  may  follow  here. 


Messages 


; Apply  message : 

LABEL  applyHeader  -  C 
LABEL  applyFunct  •  1 
LABEL  app. yRece: ver  -  2 

.-Reply  message: 

LABEL  rep.yheader  -  C 
LABEL  replylD  -  1 
LABEL  replyS.ot  -  2 
LAP  .L  replyval  je  -  ■» 


Cosmos.i 
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; RestArtContext  messxqe; 

LAB£L  -  0 

lABKL  rtstxrtlD  -  1 

;N*wObj«ct 

LABEL  RRtrObjHaxdar  -  0 
LABEL  nvwObjC’.xas  «  I 
LABEL  n«wO'  AaplylO  -  2 
LABEL  nvwOb3B*piy5Lot  -  3 

.-''lapost  ncaaaqt: 

a”  !  diapoa«H*ader  •  0 
1  jCL  diapoaalD  -  1 

;Dlapoa«BllAT  m«8sage: 
lABEL  dlapoaaBRATHeader  -  0 
LABEL  diapoaaBRATID  -  1 

;  L#ooi(upM«thod  meaaage: 

LABEL  LLooHupMdthOd  -  ID  :  I  1  <<30  l  (  OimXI  <<lX  I  {  OimY  J  <<SY  l  (C4Jt2>  <<»Z  i  tcams)  <<8£) 

REF  REV  l«oolcupHethod  ~  LL.ookupM«triod 

LABEL  lookMethSalecior  -  2 

LABEL  Ioo)iMathCIa88  -  3 

LABEL  looliMethReplylD  -  4 

; M«t hodRepIy  m*8sage : 

LABEL  mathodRaplyHeader  -  C 
LABEL  maChodRaplyLD  >  I 
LABEL  mathodRaplyValua  -  2 

; RaquaatOb jact  maaaaqe : 

LABEL  raqObjHaadar  -  Q 
LABEL  raqOb^lD  -  1 
LABEL  raqObjReply^oae  •  2 

.  vpoateHoma  nie88aqe: 

LABEL  updLHomaHeadc;  ~  C 
LABEL  updtHomelD  •  L 
LABEL  updtHomaNoda  -  2 

; Unlock  maaaaqa: 

LABEL  jniockHaadar  -  C 
LABEL  unlockID  *  I 


LABEL  fB8qAcknowledqaOb*ec*s  -  L<<c!Ls«fLN 


Clobal8 

LABEL 

^DR 

TampDiv_Count  -  4 

.•Divide  tem.porary. 

LABEL 

ADR 

LlmitOvarride  -  3 

;hll  or  IP  to  wn.ch  a  ::m;i  JawJt  s.-.Ow-c  5um.p  tone 

LABEL 

AIJR 

raatConlaxtQjeua  -  i 

;Queje  cf  fas:  coniexts 

LABEL 

ADR 

TajnpCH_RC  -  7 

. Compactheap  temporary  - 

LABEL 

ADR 

FirsiFree  -  6 

.Pointer  to  first  free  neap  word 

LABEL 

ADR 

LaatFree  -  9 

.'Pointer  to  last  free  heap  word  pL4.s  orie . 

LABEL 

ADR 

BRATFrae  •  LO 

.Pointer  to  free  BRAT  .inns- 

LABEL 

ADR 

LaatOb jeci I C  -  11 

;  ID  of  last  or.'iect  tc  be  al.ccateo- 

LABEL 

ADR 

MaxtCi  acoo  ;  i*  D  •  12 

; ID  of  next  dliirip-teo  object  ic  be  a..ocated. 

LABEL 

ADR 

SariaLNoda  •  13 

;Tnia  node's  serial  number 

LABEL 

ADR 

NodaMask  -  lb 

;Tne  nodewasK  constant. 

LABEL 

ADR 

HaapStart  •  16 

.Pointer  to  ine  beginr.in.q  of  tne  re.ocatao.e  neap 

LABEL 

ADR 

RandomSaea  •  17 

.Random  number  seed. 

LABEL 

ADR 

TampXLATE  RO  -  18 

-XLATP  fault  na'.dler  temporaries 

LABEL 

ADR 

TampXLATE'Rl  -  19 

LABEL 

ADR 

TampXLATE* R2  -  2C 

LABEL 

ADR 

TampXLATE'f IP  -  21 

LABEL 

ADR 

TampXLATE>  :s  -  22 

LABEL 

ADR 

TampXL-ATE  Tamp  -  23 

LABEL 

ADR 

TampOC  FIP  -  24 

.  Di  sposeUt.  -ect  :  empora  ry 

LABEL 

ADR 

TampNC  f:p  -  2S 

.NewContext  temporaries 

LABEL 

ADR 

TampNC  :D2  -  26 

LABEL 

ADR 

TempNC_R2  -  27 

LABEL 

ADR 

TempNC~R3  -  28 

LABEL 

ADR 

TempCH  F:p  -  29 

;  Compact  near.  ter^>orar;es- 

LABEL 

ADR 

TampCM*H2  -  3C 

LABEL 

ADR 

TempCH  R3  -  31 

LABEL 

ArR 

TempCH  A3  •  3? 

LABEL 

ADR 

TampCH'lD3  -  33 

LABEL 

ADR 

TempCH  Lock  •  34 

LABEL 

ADR 

TempCH  Src  -  3b 

LABEL 

ADR 

TempANC  FIP  -  36 

;  A 1  .  cc.  NewCb^eCt  te"‘pcfar> 

LABEL 

ADR 

TempEB  Key  •  37 

; tnt e r b i nd : nq  temporary 

LABEL 

ADR 

TampL>t  F:p  •  38 

;  Look  .^pMet  noo  temporaries 

LABEL 

ADR 

TempIsTTH  Conte*:  -  39 

;  I  r  .  t  ;  a  1  /eWDP  t e-pc  ra  r y  . 

LABEL 

ADR 

TempTOf  fTp  -  4C 

,  Cl  asst,  f  temporary  . 

LABEL 

ADR 

TempDlv‘R2  •  TempTOf  P P 

•Divior  lerporaries- 

LABEL 

ADR 

TempDiv^Rl  •  41 

LABEL 

ADR 

TempDiv* 8COCCCOC  -  42 

LABEL 

ADR 

TempNC^FlP  -  TerpDiv_»2 

;Newr,:,s.yre  te^pf^rary 

LABEL 

ADR 

Tem.pDeai.or  FIP  -  <3  ’ 

.  Dea  .  .  Cs.ati‘.' t  ’e-'t  te'^t-irury 
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I  Fault.  NumD«r» 


LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
IJVBEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  vfcCTUR 
LABEL  VECTOR 
LABEL  VECTOR 
LABEL  VECTOR 


susp«nd  -  $00 
blocKHov*  •  $01 
blockSvnd  -  $02 
compactHeap  •  $03 
allocObject  -  $04 
•nterBindinq  -  $05 
looKupBlndinq  •  $06 
deleteBindinq  -  $07 
purgeBlndinq  •  $08 
nawLocalObjcct  -  $09 
allocNextObject  -  SOA 
allocNewObject  -  $0B 
newContext  -  $0C 
dlsposeContext  -  SOD 
dispoieObjact  •  $0E 
de4 1  local eOb ject  «  SOF 
newObject  -  $10 
classOf  -  $11 
lypeOf  -  $12 
objectNoda  *  $13 
pre! erredConat Ituent  - 
ro  -  $15 

looKupHethod  «  $16 
ACORupMethodU  ■  $17 
divide  -  $11 
newCIoture  -  $19 
calicioaure  -  $1A 


$14 


XLATE  Fault  codes 


LABEL  obtectXI^TE  -  C 
LABEL  ir.ternalXLATE  -  1 
LABEL  lOCalXLATE  -  2 
LABEL  restoreXLATE  -  3 


Fird  arc  brirq  trie  object  nere. 

Saf^e  as  iocaiXLATE  but  also  woms  for  c-asses  ana  selectors. 

Return  me  cb'-eci  aaoress,  its  noae  rumoer,  or  ML  if  it  is  a  constant 
Restore  an  aaoress  reqis'.'*r  *roft  a  savea  ID  value. 


Halt  Cooes 


LABEL  na.t^a-.tC  -  0 
LABEL  naltfauit;  -  i 
LABEL  naltFuture  -  2 
LABEL  r.altOver  { 4  ow  •  3 
LABEL  ftaltType  -  4 
LABEL  haltLser  -  5 
LABEL  naURanqe  -  6 
LABEL  haltCa* .  -  7 
LABEL  nalt Invai ioAI  -  8 
LABEL  naltKeply  -  9 
LABEL  ha  ItUnlnitVar  -  13 
LABEL  haltTypeCf  -  14 
LABEL  haltXLATE  -  15 
LABEL  haltBRATType  -  18 
LABEL  haitBRATMissinq  -  17 
LABEL  naltBRATOelete  •  18 
LABEL  ha ItCIassType  •  19 
LABEL  ha  It  In: erna 1  Type  -  20 
LABEL  haltBRATFul 1  -  21 
LABEL  ha  ItMeir.Ful :  -  27 
LABEL  haltAppiy  -  23 
LABEL  haltHeap  -  24 
LABEL  haltLit.it  -  25 
LABEL  haUDivC  -  26 


:Genera-  priority  C 
.-Genera;  priority  ;  fa-.t. 

.Futures  are  not  iffip.enc.nted  yet. 

;Biqn-ms  are  not  imp.ef^entco  yet - 

.-Overridinq  bwilt^lr.  selectors  is  not  isp.etrer.teo  yet. 

.-Halt  by  user  proqra-T. 

.Ranqe  exceeded  in  a  prifrlt.ve  operation. 

;L‘ndefi.ned  system  ca*!. 

;A:  inva.id- 
; Reply  to  a  bao  Siot . 

.-An  uninitialized  variable  was  referencec. 

.-Nonexistent  or  incorrectly  taqqed  object  passed  to  typeOf. 
.-Nonexistent  or  incorrectly  taqqed  object  is  XLATEd. 

;An  Object's  BRAT  entry  is  rr.issinq  or  mistyped. 

.-An  object's  BRAT  entry  is  tr.tssinq. 

.Attempt  to  delete  a  missinq  BRAT  entry. 

.-Incorrectly  taqqed  word  ..sed  as  a  class. 

;A  non*CST-taqqed  wore  -seo  as  an  object. 

;Tne  BRAT  is  ful* . 

.•Memory  is  ful .  . 

.-Attempt  to  app.y  an  incorrect. y  taqqed  word. 

.-Heap  is  in  an  inconsistent  state  durinq  a  compaction. 

.-An  object's  limit  Is  exceeded. 

.-Division  by  rero. 
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Cosmos.m 


MDP  Operatirtg  System 

version  2.3 

written  by 
Waldemsr  Horwst 

Master's  thesis  under  Prof.  Hilliarr  Daily 

March  28.  1989 
May  1991 

Send  problems  and  comments  to 
waldemarf hx . Ics .mit . edu . 

Copyright  1989.  1990.  1991  Waloemar  horwat 


INCLUDE  "Cosmos,  i** 

.'Each  routine  and  section  of  code  has  an  attribute  called  criticality.  A  criticality  is 
;a  number  between  0  and  7  with  the  following  meaningsr 

; 0 :  All  operations  allowed.  Caller's  registers  not  preserved. 

; 1 :  Caller's  registers  preserved.  May  suspend,  so  caller's  qlooals  are  not  preserved. 

:2:  No  suspending  faults,  no  modification  of  context  state. 

;3:  No  suspending  faults,  no  modification  of  context  state,  no  OD)ect  migration. 

;4:  No  message  sends,  no  object  migration. 

;S:  No  heap  compaction,  no  message  sends. 

No  faults,  no  heap  compaction,  no  message  sends. 

;7:  No  priority  1  interrupts,  no  faults,  no  heap  compaction,  no  message  senos. 

;A  routine's  criticality  is  no  greater  than  tne  criticality  of  any  of  its  components,  subroutines, 

;or  fault  handlers.  The  criticality  of  a  fault  handler  can  be  no  greater  than  S. 

.'Each  fault  handler's  code  starts  at  criticality  6  unti«  tne  state  in  the  fault  registers  is  saved. 
.'Dereferencing  an  address  register  (other  than  AO  in  absolute  mooe>  has  criticality  at  most  i  unless 
:the  code  has  criticality  5  and  the  address  register  is  known  to  be  valid. 

;A  routine  t.ut  uses  a  global  must  have  criticality  at  least  2. 

.‘Faults  that  save  registers  should  start  in  unchecKed  mode  because  a  register  might  contain  a  CFUTure. 


90. .NNodes*! 


MODULE 

ENTRY 

ORC 

DC 

BR 

ALL 

$400 

InitialireMOP-c 

RO 

•♦2) 

.•Reset  IPB  locat 
.'Co  initialize: 

ORG 

BRATEnd 

OSStart : 
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#««## 

t«  • 

*1  Fault  Handlers  • 

«l  « 


I  Crash  on  a  general  priority  '  or  1  or  undefined  systeic  call  fault. 


CrashO: 
Crashl : 
CrashCall : 


HALT  haltraultO 
HALT  haltFaultl 
HALT  haltCall 


fltCrashO  «  IP :  abs  I  fault  I  unchecked  I  CrashO«offsetN 
fltCrashi  *  IP :  abs  I  fault  I  unchecked  I  Craanl«offsetN 
fltCrashCall  •  IP  :  aba  1  fault  |  unchecked  I  CrashCall«of  f  setN 


I  Crash  on  a  general  future  or  type  fault. 


CrashFuture:  HALT  haltFuture 

CrashType:  HALT  haltType 

f ItCrashFuture  -  TP : abs l fauit | unchecked i CrashFuture<<offsetK 
fltCrashType  -  IP : abs I  fault i unchecked tCrashType<<of ( setN 


I  Handle  the 

early  or 

send  fault  by  re-trying 

the  operation. 

(Criticality 

I 

5. 

RetryHandler : 

MOVE 

RO,  FOPO 

.-save  RO.  Criticality  6. 

HOVE 

FIP, RO 

.-Back  up  FIP 

by  one  instruction. 

ROT 

RO,  -phaseK, RO 

SUB 

RO,  1,R0 

ROT 

RO,  phaseN, P  2 

MOVE 

RO, FIP 

MOVE 

FOPO, RO 

.-Restore  RO . 

MOVE 

FIP,  IP 

fltEarly  -  IP 

:  abs 1  fault ( unchecked ( Ret  ryHandler<<of  f  setN 

fltSend  -  IP; 

abs 1  fault  1  unchecked i Ret ryHandler<<of f setH 

;  I  Handle  a  limit  fault 

Halt  unless  LimitOverr ice  was  set. 

in  wnich  case  clear  it  and 

;(  jump  to  the  overr.de 

routine.  RC  and  R1  are 

altered  when 

LimitOverride  is  used. 

I 

(Criticality 

I 

S . 

LimitHandler : 

MOVE 

( Limit Over ride,  AO ] ,  RO 

.-Criticality 

6. 

BNIL 

RO, "Limit  Halt 

.'Halt  unless 

LimitOverride  was  set. 

MOVE 

NIL.  R1 

.-Clear  LimitOverride. 

MOVE 

Rl,  ( LimitOverr ide, AO ; 

MOVE 

RO,  TP 

;Co  to  the  override  routine. 

Limit^Halt : 

HALT 

haltLimlt 

(ItLlmit  -  IP 

;absifault( unchecked ! Limi t Hand 1 er < <of  f setN 

;  (  Handle  a  CFUTure  fau 

It . 

I 

1 Cr it icai i ty  1 . 

I 


1  Save  the  current  state  in  the  context,  suspeno,  ano  allocate  a  r;ew  Tast  context  for 
I  the  next  message.  Various  entry  points  are  provioed  depending  on  how  much  savp  has 
I  to  be  saved.  These  routines  do  not  return. 


I 

lEntry:  SaveStateID023  <100,  102,  IC3,  and  the  message.  If  any,  have  to  be  saved.) 
lEr.try:  SaveStatelDOl  (100,  103.  and  the  message,  if  any,  have  to  Oe  saved.) 
(Entry:  SaveState  (No  registers  have  to  be  saved.) 

(Unchecked  absolute  non-fault  mode  required. 


CFUT_HaU  : 
CFUTHandler ; 


SaveStatelOO?3 : 


HALT 

haitUninitVar 

MOVE 

RO,  1  context R 0, A 1 ! 

MOVE 

fl 1 , 1 cont ext Rl , AI ; 

MOVE 

R2, (contextR?, Al ) 

MOVE 

B3,  i  contextRJ, Al ] 

MOVE 

FOPC, R3 

IF  DEBUG 

GT 

R3, C, RC 

BF 

RO, 'CFUT  Ha  It 

END 

MOVE 

B3,  i cont ext  Next , A 

MOVE 

FIP, HI 

MOVE. 

Rl  ,F 

ROT 

R 1 , -pha  seN, R 1 

SUB 

R  1 ,  5 ,  R  : 

ROT 

R 1 ,  phaseN,  R 

MOVE 

R 1 ,  i context  I P, Al I 

MOVE 

ID2,RC 

MOVE. 

RO,  i context  I D2, Al 

;Ar:  uni  r.  it  I  a  1  i  red  variable  was  referenced. 
.'Save  RO  and  HI.  Criticality  6. 


.-Halt  if  ar.  ur.  in  1 1 ;  a  ,  i  zed  variable  was  referenced. 

.•Marx  ..p  h  :y  ny  ent*  i nst  r .,ct  i on  . 

. Cr . t : ca, ity  2  - 


.-Save  IP,  K?.  R3,  IOC,  102,  and  IP’  ir  the  context. 
.Save  IDS,  102,  and  103  in  the  '-jrrent  context. 
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SaveSt.AteID03 : 


SaveState^Msq : 
SaveState : 


MOVE 

MOVE 

MOVE 

BF 

MOVE 

SUB 

AND 

BR 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

HOVE 

MOVE 

MOVE 

MOVE 

MOVE 

HOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

BNIL 

XLATE 

MOVE 

MOVE 

SUSPEND 


IDO, RO 

RO,  ( context  I  DO, A1 I 
Q,  RO 

RO, ''SaveState  Msg 
U.RO 

RO,  (0,A3],R0 
RO, lengthM,RO 
RO 

[  1  b,  A3)  ,R0 
RO,  1  lb, A1 1 
[14,A3],R0 
RO,  [  U,A1  ) 

[13,A31,R0 

no,  ( 13, AI I 

[12.A3),R0 
RO, (12, Al) 

111, A31,R0 
RO, 

(10,A3),R0 
RO, [ i:, Al ] 

(9,A31,R0 
RO, [9, Al 1 

(8, A31.ro 
RO, [e,Al] 

{?,  A3I,R0 
RO, 17,A1) 

(6.A3],R0 
RO, [6. All 
IS,  A3],R0 
RO, (5, All 
[4,A3],R0 
RO,  14, Al 1 
(3.A31.R0 
RO. [3, All 
12, A3] ,R0 
RO,  (2,  Al  I 
IDl , RO 
RO,  1D3 
:d3.ro 

RO, 1  context ID3, Al ) 

{ FastContextOueue, AOl ,  RO 
RO,  “AHocFastContext 
RO, ObjectXLATE,  Al 
(cent  ext Next , Al I , RO 
RO, (FaatContextQueue,  AO  I 


.‘Save  IDC  and  ID3  in  tne  Cc,rrent  context. 

.-Check  whether  the  message  should  be  copied  into  me  context. 
.‘Don*t  copy  If  A3  didn't  point  into  the  queue. 

.‘Copy  the  message  Into  the  context 

.‘Jump  into  the  appropriate  place  in  the  copy  code. 


.‘Allocate  a  new  fast  cor.text- 
.‘There  are  no  more. 

.•unlink  it . 

.‘Criticality  1. 


I  Allocate  and  initiali.:e  a  new  fast  context 
)  routine  does  not  return. 


I 

I  Entry:  AllocFast Cent ext 

I 

lUnchecked  absolute  non*fault  mode  required. 

I 


AllocFaStContext :  DC 
CALL 
HOVE 
XLATE 
SUSPEND 


OBJ :hdr Locked i contexts 

allocNextObject 

102, R1 

Rl,objectXLATE,Al 


to  DO  used  oy  the  next  message.  This  i 


I 


.'Create  the  context  object. 

.'Point  Al  and  IDl  to  the  new  context. 


I  Suspend;  if 

I  This  routine 

slow  context  was  used,  deallocate  It  and  rep..ace  it  with  a  fast  one.  « 

does  not  return.  > 

iCall;  suspend 

lln:  AlDl 

Context 

iCriticality  0 

uspend: 

MOVE 

(con text  Header, Al ] , RC 

•  Cr 1 1 i cal  3 ly  3. 

ROT 

RO, -hdrFastN, RC 

.‘Check  whether  this  was  a  fast  context. 

BT 

RC,  ''Suspend  Fast 

;  Yes  - 

MOVE 

101 , RO 

;No.  OisDcsc  this  context  and  allocate  a 

CALL 

di sposeObject 

BR 

“SaveState 

Suspend_Fast :  SUSPEND 


f ItCFUT  -  IP:abslfaulti unchecked i CFUTHand ler<<of f set N 
fltSuspend  •  IP-abs:  'unchecked  i  Suspend<<of  f  setN 


I  Handle  an  INVADR  fault.  If  tne  object  is  on  this  nooe,  store  its  address  in  the 
1  address  register,  if  it  is  not  on  this  node,  bring  it  here. 


.‘ICriticality 

:  1 

1 . 

INVADRHandler ; 

MOVE 

R:  ,  (TempXiATE  R1 ,  AO 

MOVE 

FIR.RI 

ASD 

HI  ,  3, ftl 

LSH 

R 1 ,  2 ,  R 

BN  2 

RI  ,  HI 

MOVE 

IDO, R1 

PROBE 

R 1  .  R 1 

BNIL. 

RI , * INVADR  MissO 

MOVE 

R1  ,  AO 

MOVE 

F1R,R1 

BNIL 

Rl, 'INVADR  Rstrt? 

.'Save  R;.  Cri  t.culiiy  6. 

.‘FIR  contains  tne  correct  address  register  number 
.even  wner.  KIM»n:l. 

;  Cc!  to  orjp  of  fo.;r  narid.ers 

jChecx  the  x.a'.c  caufie  first. 

;Ji;rrp  into  the  objectXiJtTi  handier  if  missed. 


;If  the  FIR  was  Ni;..  don't  bars  up  the  FIP. 
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IKVADR  Miss: 


INVADR  Mlss2: 


INVADR  Restart : 


ZKVADR  Rstrt2: 


MOVE 

FIP.Rl 

BR 

'IMVAOR_Re*tart 

HALT 

haltlnvalidAl 

MOVE 

RO, i TempXLATE  R0,A0] 

.'Save  RO,  R2,  and  FIP. 

MOVE 

FIP,R0 

MOVE 

RO. [TempXLATE  FZP.AO) 

MOVE 

R2, (TempXLATE  R2, AO) 

MOVE 

R0,F 

.'Criticality  S. 

BR 

'XLATE  ToObject 

.‘Jump  into 

tne  ObjectXLATE  handier 

MOVE 

102. Rl 

PROBE 

Rl ,  Rl 

.-Check  the 

xlate  cache  first. 

BNIL 

Rl, 'INVAOR  Hiss2 

.'Jump  into 

the  ObjectXLATE  handler 

MOVE 

R 1 .  A2 

MOVE 

FIP.Rl 

BR 

'INVAOR  Restart 

MOVE 

ID2,R1 

BR 

'INVAOR  Miss 

MOVE 

103, Rl 

PROBE 

R1,R1 

.'Check  the 

xlate  cache  first. 

BNIL 

Rl, 'INVAOR  Miss3 

.‘Jump  into 

the  ObjectXLATE  handler 

MOVE 

Rl,  A3 

MOVE 

F1P,R1 

ROT 

Rl , -phaseN,  R! 

.‘Restart  the  instruction. 

SUB 

Rl,  l.Rl 

ROT 

Rl , phaseN, Rl 

MOVE 

R1,FIP 

MOVE 

[TempXLATE  R1,A0],R1 

MOVE 

FIP, IP 

if  missed. 

if  missed. 


if  missed. 


INVAOR_MissO: 

MOVE 

FIR, Rl 

BNNIL 

Rl, 'INVAOR  HO  2 

MOVE 

FIP, Rl 

ROT 

Rl . -phaseN, Rl 

ADO 

Rl.  l.Rl 

ROT 

Rl , phaseN, Rl 

MOVE 

Rl , FIP 

INVADR_M0_2 : 

MOVE 

IDO. Rl 

BR 

'INVADR  Miss 

IKVA0R_Miss3 : 

MOVE 

ID3,R1 

BR 

'INVADR  Miss 

.'Advance  the  FIP  if  the  FIR  was  NIL. 


I  Handle  an  XLATC  fault. 

I  Two  bits  of  the  instruction  are  used  to  determine  what  to  do.  The  possible  actions 
I  are : 

\  objectXLATE:  Return  an  ADDR  containing  the  object's  address.  If  the  object  is  not 
I  on  this  node,  bring  it  here.  Rs  must  be  an  ic,  a  DID,  a  class,  or  a  selector. 

I  localXLATE;  If  Rs  represents  an  object  on  this  node,  return  its  address;  if  Rs  is 
I  a  constant,  return  NIL;  otherwise,  return  the  number  of  a  node  lisely  to 

I  contain  the  object.  This  mode  can  be  used  only  when  Rd  is  a  data  register.  Rs 

I  must  be  an  10,  DIO,  a  class,  a  selector,  or  a  constant. 

I  internalXtATE;  Same  as  localXLATE  except  that  treats  futures  as  if  they  were 
I  objects  instead  of  constants. 

I  restoreXLATE:  Invalidate  Rd  by  storing  an  invalid  aooress  tnere.  OI  course,  if 
I  the  XLATC  Cable  hies,  the  value  associated  wicn  Rs  is  stereo  in  Rd  instead. 

I  XLATE  should  be  made  to  fault  on  Futures  or  CFUTures;  mis  can  oe  accomplished  by 
I  calling  XLATC  in  checi^ed  mode. 


I  The  criticalities  are  as  follows; 

I  ObjectXLATE;  Criticality  1  (criticality  S  if  the  ODjeui  is  xnown  to  reside  or.  this  hodei. 
I  InternalXLATE;  Criticality  i. 
i  iocaiXLATE:  Criticality  i. 

I  restoreXLATE;  Criticality  S. 


XLATEHandler :  MOVE 

MOVE 
MOVE 
MOVE 
MOVE 
MOVE 
MOVE 
MOVE 
ROT 
AND 
AND 
BR 

:  BK 

BR 
BR 
DC 

XLATE_ResuUr  ROT 
BR 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

BR 

DC 

MOVE 

BH 


BR 

DC 

MOVE 

BH 

DC 


RO, [TempXLATE_RO,AOi 
Rl, (TempXLATE_Rl,A01 
R2, (TempXLATE_R2, AO; 
FIP, RO 

RO, [TempXLATE  FIP, AO) 

FIR,R2 

FOPl,Rl 

RO,  F 

R2,  -9.  RO 
R2, 7, R2 
R0,3,R0 
RO 

'XLATE  ToObject 
"XLATE'interna; 
“XLATE^Local 
ADDR ; rel I  invalid 
R2, i,R2 
R2 

lTempXLATE_R2, AOj ,R2 
[  Terr.pXLATE_R  1 ,  AO  J ,  R ! 
[TempXLATE  FIP,  AC:  ,  IP 
R0,RJ 

ITempXLATE  R?,A01,R2 
i TempXLATE" RO.  AO) ,  RC 
lTempXlJSTF.>IP,AC;,  IP 
RO,  R2 

[TempXLATE  Rl , AO  j  ,  RI 
iTcmpXLATE  H0,A0i,RC 
ITpmpXLATEFIP.AO;, !P 
R0.R3 

'XLATE  R  Derc* 

0 

RO,  AO 

'X1>TF.  R  Loru* 

C 

HO.  a: 

'X:.ATE  H  Done 

c 

RC,  A? 

'XLATE  H  Dorr 


.Save  PO.  Rl.  H2,  FIP,  and  FIR.  Criticality  6. 


.•Criticality  S  . 

.'Save  the  oestlr.atior  addressing  mode  in  R2 . 


Get  x.ne  object  . 

Co  to  the  interna,  codc. 

Co  to  the  .oca*  code. 

RestoreXLATE:  Invalidate  the  address  register. 

Store  RO  in  the  destination  of  the  XLATE.  Rl  contains  the 
value  of  Rs  and  is  stored  in  the  ID  register.  R2  contains  the 
addressing  mode  from  the  XLATE  instruction. 
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MOVE 

MOVE 

XLATE_R  Don*:  MOVE 

MOVE 
MOVE 
MOVE 


RO,  A3 
Rl. ID3 

[TempXLATE  R2,A0],R2 
|T*mpXLATE~Rl,  AOl  ,R1 
[TempXLATE  R0,AQ),R0 
[TempXLATE'FIP.AOl. IP 


XLATE_L_TAGO:  AND 

SUB 
BZ 
SUB 
BZ 
BR 


RO, subtaqM, RO  ;lf  th«  v*lu*  it  a  class,  pretend  it  Is  an  ID. 

RO.subCLASS.RO 

R0,'‘XLATE_I  SC 

RO. SUbSEL-SUDCLASS.  RO 

RO.  '‘XLATE_I_SC 

“XLATt  L  NIL 


XLATE_ToObject :  CHECK 
BR 


RI,  ID.RO 
'‘XLATE_Obj*ct 


XLATE_Internal :  CHECK 
BT 

XLATELocal:  RTAG 

BR 
ROT 
BR 

XLATE_L_NIL:  MOVE 

BR 

:  MOVE 

BR 

HALT 

HALT 

HALT 

HALT 

HALT 

XLATE  I_SC:  MOVE 

BR 

MOVE 

BR 

:  MOVE 

BR 

MOVE 

BR 

HALT 

:  HALT 

:  HALT 

XLATE  Halt:  HALT 

XLATE"L  DID:  MOVE 

CALL 
PROBE 
BNIL 
ENTER 
BR 

XLATE  L  :0:  CALL 

BNIL 
CHECK 
BT 

CHECK 

BF 

MOVE 

ROT 

AND 

MOVE 

OR 

XOR 

MOVE 

MOVE 

ENTER 

BR 

XLATE  L  Miss:  HOVE 

AND 
MOVE 
EQUAL 
BT 

XLATE_L  2:  MOVE 

BR 


R1,FUT,R0 
RO, *XLATE_I_SC 
Ri  ,  RO 
RO 

Rl , -subtagN, RO 
“XLATE  L  TACO 
NIL,R0“ 

•XLATE_Re*uit 

NIL.RO 

“XLATE  Result 

naltXLATE 

haltXLATE 

haltXLATE 

haltXLATE 

haltXLATE 

R2, (TempXLATE_FIR, AOJ 
“XLATE  L_ID 
R2, (TempXLATEFlR.AO; 
“XLATE  L_DID 
NIL,  RO' 

“XLATE_Result 
NIL,  RO 

“XLATE_Re*uU 

haltXLATE 

haltXLATE 

haltXLATE 

haltXLATE 

R1.B2 

prefer redCon St i tuent 
R1,R0 

RO,  “XLATE  L_10 
R2,R0 

“XLATE_L_2 

lookupBlndinq 

RO, “XLATE  L  Miss 

RO, INT,R2 

R2,  “XLATE  L  2 

BQ,AD0R.R5 

R2,  “XUT£^L_Cxt 

RO, ITerapXLATE  Temp.AOi 

R0.*baseN,R2 

R2, baseM, R2 

|R2,AC] ,R0 

RO, hdrMarXed, RO 

RO, hdrMarKed, RO 

RO, [R2.A0] 

(TempXLATE  Temp,ACj,RO 
Rl.RO 

“XLATE_L^2 

( NodeMask, AO  j , RO 

RO,B1,RO 

NNR, R2 

RO. R2. R2 

R2, 'XLATE  Halt 

(TempXLATEFIR, A0),R2 

“XLATE  Result 


: XLAT£_Intcrnal  is  the  same  at  XLATE_Local  fer  value*  which 
;aren*t  futures. 

.-Dispatch  on  the  taq  of  the  object. 

;  TACO 
;  INT 


;ADOR 
;  IP 

;MSG  /  OBJ 
;  CFUT 
;FUT. 

;10.  Save  the  FIR. 

.-DID.  Save  the  FIR  - 

;  TACA 

; F  LOAT 

; INSTO 
; INSTl 
; 1 NST2 
; INST3 

.'Save  the  DID. 

.-Get  an  ID  from  the  DID. 

.-Check  If  the  constituent  ID  is  in  the  cache. 

;lf  so,  enter  and  return  it. 

.-Look  for  a  binding  of  the  object  on 
.-this  node. 

:If  an  integer  was  found,  it  is  the  object's  current 
; node  number . 


.'Save  RO. 


:Fetch  the  object's  header  ano  clear  the  marked  flag  in  it. 


.'Restore  RO. 

.-Found  such  a  binding.  Enter  it  in  the  XLATE  table. 

;Dio  not  tind  a  binding.  Extract  me  node  number  from 
:the  ID  and  go  return  it. 

.-If  the  node  number  is  this  noce,  halt  because  this  is 
.-supposed  to  be  tne  home  node,  yet  it  doesn't  know  where  the 
.'Object  is. 

;Go  return  the  result  in  RC. 


XLATE  L  Cxt  -. 


CHECK 

RO, ID,R2 

;Tne  ID  is  bound  to  a  context.  Dereference  tne  context 

BF 

R2, “XLATE  Halt 

.-reaa  its  contextNext  field. 

MOVE 

Rl , [TempXLATE  Temp.AOi 

.-Save  the  object  ID. 

MOVE 

RO,  Rl 

CALL 

lookupBlnding 

MOVE 

(TempXLATE  Temp, AO), Rl 

.-Restore  the  object  ID. 

CHECK 

RO. ADDR, R2‘ 

BF 

R2, -XLATE  Halt 

ROT 

RO, 'baseN7R2 

AND 

R2, baseM,  R2 

MOVE 

context  Next ,  RO 

ADD 

R2,B0,R2 

MOVE 

[R2,A0i,RC 

BNIL 

RC. “XLATE  L  Miss 

.’Miss  if  it  was  NIL. 

CHECK 

RO, INT, R2' 

;lf  an  integer  was  founc,  it  is  the  object's  current 

BF 

R2, “XLATE  L  Cxt 

;noae  nunocr;  otherwise  t.here  is  another  context  linked 

BR 

“XLATE  L  2 

XLATE_0_Acce8S :  MOVE 
ROT 
AND 
HOVE 
OR 
XOR 
MOVE 
MOVE 
ENTER 
MOVE 

XLATE_0_Rebind :  DC 

MOVE 

SUB 

MOVE 

MOVE 

MOVE 


RO,  [ TempXLATE_Temp,  AO ; 

RO, 'baseN, R2 

R2, baseM, R2 

[R2,A0i,R0 

HO,  hdrMa  rked,  RC 

RO, hdrMarked, RO 

RO,  1R2,A0' 

[TempXLATE  Temp,AOi,HC 
R1,R0 

(TempXLATE  R2, AO) , R? 
phase 

(TempXLATE  flP,A0!,R; 
ri,rc,r: 

Rl, [TempXLATE  F1P,AC, 
[TempXLATE  RirA0],R) 
ITempXIATE'  aC,  AC  ,  HC 


;  Save 


RO  . 


.-Eetch  the  object's  header  and  clear  the  marxed  flag 


; Restore 
: t ouno  a 
; restart 


RO. 

binding, 
the  XLATF 


Enter  it  in  the  XLATE  table  a.'id 
lor  aaoress-'faulted)  instruction. 


it  . 
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Appendix  F 


Cosmos  Listing 


Cosmos.m 


XLATE  O  2: 


XLATE_H*:t_2: 
XLATE_Object : 


XIATE_0_2 : 
XLATE  O  ID: 


XLATE  O  Point : 


XLATE  0_Hiss: 


XLATE  0  Fetch: 


XLATE^Suspend : 
XLATE_0_Append 

fltlNVADR  -  IP 
fltXLATE  -  IP- 


MOVE  : TempXLATE_F 1 P . AO  1 , IP 


IF  DEBUG 


CHECK 

R1,TAG0,R0 

BF 

RO, 'XLATE  Halt  2 

ROT 

Rl,‘-subtagN,RO 

AND 

RO, subtagM, RO 

SUB 

R0,aubCLASS,R0 

BZ 

RO, 'XLATE  0  ID 

SUB 

RO, subSEL-SubCLASS,  RO 

BZ 

RO, 'XLATE_0_1D 

END 

HALT 

haltXLATE 

BT 

RO, 'XLATE  0  ID 

CHECK 

R1,D1D,R0 

BF 

RO. 'XLATE  0  2 

MOVE 

R1,R2 

CALL 

preferredConst ituent 

PROBE 

RI,  RO 

BNIL 

RO, 'XLATE  0  ID 

ENTER 

R2,R0 

MOVE 

[TempXLATE  R2,AQI.R2 

BR 

'XLATE  O  Rebind 

IF  'DEBUG 

END 

CALL 

lookupBinding 

BNIL 

RO, 'XLATE_0_Miss 

HOVE 

R2, (TempXLATE  Temp, AO) 

CHECK 

RO, ADOR, R2 

BT 

R2, 'XLATE  0  Access 

CHECK 

RO, INT, R2 

BT 

R2, 'XLATE_0_Point 

MOVE 

i TempXLATE  Temp,AD),R2 

BR 

“XLATE  O  Fetch 

MOVE 

[TempXLATE  Temp,ACi,R? 

SENDO 

RO 

DC 

MSG :msqPeqjestODject»3 

SENDO 

RO 

MOVE 

NNR, RC 

SEND2E0 

Rl.  RO 

BR 

-XLATE_0_FetCh 

DC 

MSG :msqRequestOb ject  *3 

AND 

Rl, [NodeHask, AOi , R2 

SEND20 

R2,R0 

MOVE 

NNR, RO 

SEN02EO 

R1,R0 

R0,R2.R2 

R2, 'XLATE  Hail  2 
NIL,  R2 

[TempXLATE  RO,A01,RO 
RC,  IcontextRO, AI j 
I TerapXLATE^Ri ,  AO ) ,  RO 
RO, IcontextRl, A] ) 

I TempXLATE_R2. AC  J , RC 
RO, |contextR2, Al ) 

! TempXLATE_FI P,  AO ) ,  RO 

RO, •phaseN7RQ 

RO,  l.RO 

RO, phaseN,  RO 

RC, (eontextIP, a: • 

r3, (contextRi, A1 ! 

ID2,R0 

RO, |contextID2,Al) 
ID1,R0 

R2, 'XLATE_0_Append 
R2, (contextNext , Al 1 
enterBindinq 
SaveStateID03-{**2> 

RO 

IR2,ACJ,R3 
R3, IcontextNext, Al ) 
RO,  [R2, AOJ 
“XLATE_Su8peno 


EQUAL 

BT 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

ROT 

SUB 

ROT 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

BNNIL 

MOVE 

CALL 

DC 

BR 

MOVE 

MOVE 

MOVE 

BR 


;C2a««es  and  selectors  are  also  objects  and  are 
.‘treated  as  if  they  ere  ID's. 


.-Dispatch  on  the  taq  of  the  object. 


.‘Save  the  DID. 

.'Get  an  ID  frofe  the  DID. 

.‘Check  if  the  constituent  ID  is  in  the  cache. 
.‘If  so,  enter  and  return  it. 


.‘Look  tor  a  binding  of  the  object. 

:Save  the  binding's  address. 

.‘If  the  object's  address  was  found,  return  it. 

.‘If  the  object's  current  node  was  found,  send  a  requestObject 
.-message  there. 

.-Otherwise  someone  is  already  waiting  for  the  object. 

.-Append  this  context  to  the  waiting  queue. 

.-Send  a  message  requesting  tne  object  to  the  object's 
.-curren’v  locatior. 


.-Seno  a  message  requesting  the  object  to  the  object's 
; home . 


.-However,  if  this  nooe  is  supposed  to  be  the  object's  home, 
.-halt  because  tne  object  doesn't  app>ear  to  exist. 

:R2  being  NIL  means  no  one  else  is  waiting  for  the  object. 
.-Save  state  in  the  context. 


.-Criticality  3. 

.-Back  up  IP  to  point  to  the  XLATE  ifjst ruct  ion . 
.-Save  IP,  R0-fl3,  !DC,  I02,  ano  ID3  in  the  context. 


;Maxe  a  binding  indicating  that  the  context  in  IDI  is 
.-waiting  for  the  object  in  R1 . 

.-Save  tne  rest  of  the  state  and  suspend. 

.-Append  the  binding  to  the  linked  list  headed  by  R2. 


:  abs ( f ault I  unchecked i I NVADRHandJer <<of  f seiN 
lbs ! fault  I  unchecked ! XLATEHandler<<of  f  setN 
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i«  #« 
tf  Heap  Manager  II 
II  II 


I  Copy  the  object  pointed  by  A3  into  the  object  pointed  by  A2-  The  copy  stop«  aa  soon 
I  at  a  limit  fault  is  reached.  A2  and  A3  are  guaranteed  not  to  be  XLATEa,  so  they  do 
I  not  have  to  correspond  to  the  values  in  the  ID  registers. 

I  The  objects  are  copied  from  the  bottom  up,  so,  if  they  overlap,  the  destination  must 
I  start  before  the  source. 

I  RO  can  be  a  number  smaller  than  32  indicating  the  offset  of  the  first  wero  in  each 
I  object  which  should  be  copied;  words  with  indices  smaller  than  RO  are  not  copied. 


(Call:  blocicMove 

lln:  RO  Offset  of  first  word  to  copy. 

I  A2  Destination  object  pointer. 

I  A3  Source  object  pointer. 

icritlcality  5. 

^Alters  RO/Rl . 


BlockMove : 


BM  MoveRest : 


MOVE 

FIP,R1 

MOVE 

Rl.  [ LimitOverride, AO  1 

MOVE 

R1,F 

BR 

RO 

MOVE 

(Q.A3],R0 

HOVE 

RO.  {0,A2) 

MOVE 

1 1. A3],R0 

MOVE 

RO.  Il,A2i 

HOVE 

|2.A3),R0 

MOVE 

RO. I2,A21 

MOVE 

(3.A3].R0 

MOVE 

RO,  13, A2} 

MOVE 

[l,A31,R0 

MOVE 

RO.  H.A21 

MOVE 

[S, A31.ro 

MOVE 

RO,  (5,A2] 

MOVE 

I€,A31,R0 

MOVE 

RO, [6,A21 

MOVE 

(7,A3J,R0 

HOVE 

RO, [7,A2] 

MOVE 

(8,A31.R0 

MOVE 

RO, i«,A2: 

MOVE 

(9, A3; .RO 

HOVE 

RO,  (9,A2J 

MOVE 

(10, A3! , RO 

MOVE 

RO, (10,A21 

MOVE 

[ll,A3l,R0 

MOVE 

RO, [11, A2I 

MOVE 

(J2,A3J.R0 

MOVE 

RO, 112, A2> 

MOVE 

(13,A3l  ,R0 

MOVE 

RO, [13,A21 

MOVE 

IU,A31,R0 

MOVE 

RO,  114. A2) 

MOVE 

(15,  A3l,R0 

MOVE 

RO,  (ib,A23 

MOVE 

(I6.A3I,R0 

MOVE 

RO,  116,A2i 

MOVE 

[17,A31,R0 

MOVE 

RO, 117,A2I 

MOVE 

[18, A3] ,R0 

MOVE 

RO,  (18, A2) 

MOVE 

1 19,  A3]  ,R0 

MOVE 

RO. (19, A2) 

MOVE 

(20,A3!,R0 

MOVE 

RO, (20, A2) 

MOVE 

(23,A3),flO 

MOVE 

RO, (21, A2] 

MOVE 

(22, A3) , BO 

MOVE 

RO. [22, A2I 

MOVE 

(23,A3),R0 

MOVE 

RO, (23, A2J 

MOVE 

(24,A3),R0 

MOVE 

RO,  [24, A2] 

MOVE 

(25,A3i,R0 

MOVE 

RO, 125,A21 

MOVE 

[26, A3] ,R0 

MOVE 

RO, (26, A2) 

MOVE 

1?7,a3],R0 

MOVE 

RO,  :27,A2i 

MOVE 

i2e,A3;,R0 

MOVE 

RO, [28, A2! 

MOVE 

i29,A3),R0 

MOVE 

RC,  [29,  A2; 

MOVE 

(30,A3:,RC 

MOVE 

BO. (30, A2' 

MOVE 

[ 3i , A3 ) , HO 

MOVE 

RC,  :3'.  ,A2 

MOVE 

32,  Rl 

MOVE 

(r:,a3],ro 

MOVE 

RC, IB1,A?: 

ADD 

R 1 ,  1 ,  R  1 

MOVE 

(Rl , A3i , RO 

MOVE 

RC, IRl,A2i 

ADD 

Rl,  1,R1 

MOVE 

(Rl,A3l,R0 

MOVE 

RC, 1R1,A2, 

ADD 

Rl . 1 , Rl 

MOVE 

[Rl , A3 ; , BO 

MOVE 

RO, (Rl. A2] 

BR 

“BM  MoveRest 

.‘Criticality  6. 

.‘Override  the  Limit  fault  for  the  duration  of  this  routine. 
.‘Criticality  S. 


;Mcve  the  rest  of  tne  GL;,ect. 
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Cosmos  Listing 


Cosmos.m 


f ItBIockMove  >  IPiabsifaulti unchecked  I &lockMove<<ol 2  setN 


I  Send  the  oDjecc  pointed  by  A2 .  The  send  stops  ss  soon  ss  s  limit  fault  is  reached. 

I  A2  is  quaranteed  not  to  be  XLATEd,  so  it  does  not  have  to  correspond  to  the  value  in 
I  1D2.  RO  should  be  one  of  the  follovinq: 

I  0:  Words  are  sent  starting  from  offset  1  in  the  object. 

1  1:  Words  are  sent  starting  from  offset  3  in  the  object. 

[  2:  Words  are  sent  starting  from  offset  i  in  the  object. 


i 

■Call:  blocksend 
I 

I  In:  RO  Encoded  offset  of  first  word  to  sena. 

I  A2  Source  object  pointer. 

I 

ICriticality  S. 

I 

lAlters  R0/R1/A2. 

I 


BlockSend : 


BS  SendRest; 


MOVE  FIP,R1 

HOVE  Rl, (LimitOverride^AO] 
MOVE  R1,F 

BR  RO 

SENDO  (1,A21 

SENDO  (2.A2] 

SENDO  |3,A21 

SENDO  I4,A2l 

SENDO  {S,A2| 

SENDO  16,A2] 

SENDO  17,A2) 

SENOG  (e.A2] 

SENDO  (9,A2) 

SENDO  [10. A2] 

SENDO  [11, A2} 

SENDO  (12, A2] 

SENDO  [13,A2i 

SENDO  (14, A2} 

SENDO  I1S,A2] 

SENDO  IU,A2) 

SENDO  |n,A2} 

SENDO  (18, A2} 

SFNDO  [19,A21 

SENDO  (20, A2) 

SENDO  (21,A21 

SENDO  [22.A21 

SENDO  [23,A2| 

SENDO  [24, A2] 

SENDO  [2S,A2] 

SENDO  [26,A2) 

SENDO  [27,A2J 

SENDO  [28, A2) 

SENDO  (29,A21 

SENDO  {30,A2I 

SENDO  [31, A2] 

MOVE  32,  BO 

SENDO  IR0,A2) 

ADD  RO,1,RO 

SENDO  [R0,A2) 

ADD  RC,1,R0 

SENDO  (R0,A2) 

ADO  R0,l,RO 

SENDO  [RC,A2: 

ADD  RO.l.RO 

BB  “BS  SendRest 


.•Criticality  S. 

.-Override  the  Limit  fault  for  the  duration  of  thla  routine. 
;Criticality  S. 


.•Sends  more  words  of  the  object. 


fltBlockSend  -  IP ; abs I f auit i unchecnea  B:octSena<<oI I  sets 


Compact  the  node's  heap,  trying  to  free  at  least  R',  words  of  memory.  Halt  if  this 
much  memory  is  not  available. 

Call:  compactHeap 

In:  RO  Number  of  words  neeoed . 

Criticality  3. 

Alters  R0/RI/AID2. 


CompactHeap:  1 NVAL 

MOVE 
MOVE 
MOVE 
MOVE 
MOVE 
MOVE 
MOVE 
BT 

MOVE 

CH_0;  MOVE 

MOVE 
MOVE 
MOVE 
MOVE 
MOVE 
DC 

MOVE 

MOVE 

DC 

MOVE 

CH  FiUShXlate;  ADD 


RO,  [TempCH  RC.  AO 
R2, [TempCHR?, A3 ‘ 

R3, (TempCH  R3,AC 
F1P,R0 

BO, [TempCH  FIP, AO; 

R0,F 
0,  RO 

RO, “CH  C 
ID3,RC' 

RO,  I  TempC«_  JC3,  AC 
A3,  RC 

RO, [TempCH  A3, ACi 
N1L,R3 

R3, 'TempCH_Locn, AO; 

R3,0 

I P  :  abs  1  uhcr,ec<ed  CH  2<<o 
RO, ( Liml tOver r ; ae, AO ! 


.'Criticality  Invalidate  all  relocatable  address  registers. 

.•Save  FIP,  RC.  R2,  R3,  Q,  A3,  and  ID3. 


: Cr it ica 1 ; ty  3 . 


; R3  will  contain  NIL  for  the  ouration  of  the  xiate  flush. 
.•Indicate  that  this  is  the  first  time  the  heap  Is  compacted. 
.'Disable  q'je.,e  wrapatourd. 
f  f sets 

.'Override  the  Limit  fajit  for  the  duration  of  this  rout’*'® 


.  R2 


ADDR  ;  ( XI  a;  eSt  a  <<ha  scN)  ♦X.at eEr.d-Xi  ateSt o  1 1 
R0,A2 

R2,2,H?  ;Checs  every  entry  in  the  XLATE  table  whether  It  contains 
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CH_2: 

CH_Compact2 : 


CH^Die: 

CH  Cc-''*ct : 


CH_Oone : 

CH^AlValid: 

CH_OQneO: 

CH_Don«2; 

CH^HarkLive: 

CH  Live: 


CH_A1  i<#nErrof ; 
CH_BRA7Error : 
fit  Compact Heap 


MOVE 

[R2, A2 ] . RO 

;a  reiocataoie  ADDR .  if  it  aoes,  replace  It  with  NIL. 

CHECK 

RO, ADDR/ R1 

BF 

Rl, *CH  FlushXlate 

AND 

RO. relTRl 

BZ 

Rl. ‘CH  FlushXlate 

MOVE 

R3, IR27a2) 

BR 

^CH_FlmhXlate 

MOVE 

[KeapStarc,A0],R2 

;R2  is  the  source  heap  scanner. 

HOVE 

R2.R3 

:R3  is  the  destination  heap  scanner. 

BR 

“CH  Compact 

HOVE 

1R2,A0),R0 

.-Get  the  next  object  at  the  source. 

ROT 

RO,  -hdrLockedN, R1 

,‘Let  it  live  if  it  is  locked. 

BT 

R1,*CH  Live 

ROT 

RO,-hdrOeIetedN,Rl 

.-Kill  It  if  it  is  deleted. 

BT 

Rl, “CH  Die 

ROT 

RO,-hdrHarkedN,Rl 

.-Purqe  It  if  is  is  marked. 

BF 

Rl , “CH  MatkLive 

ADD 

R2.  1,RT 

.-Read  the  object's  ID  Into  Rl . 

MOVE 

[R1,A0],R1 

CALL 

deleteBindlnq 

; No  need  to  purqe  the  xlate  table- 

MOVE 

(R2,A0],R0 

AND 

RO, hdrLenqthM, RO 

.‘Skip  the  source  heap  scanner  past  the  removed  object. 

ADO 

R2.R0,R2 

GE 

R2. (FlrstPree,A0),R0 

.-Check  whether  the  entire  neap  was  scanned. 

BF 

R0,“CH  Co»pact2 

EQUAL 

R2, (Firstrree,A0],R0 

:lf  so,  then  R2  must  match  FirstFree  exactly. 

BF 

RO, “CH  AlignError 

MOVE 

R3,  (FirstFree,  AQ) 

.-Update  FirstFree. 

MOVE 

[LastFree,A0],R0 

SUB 

R0.R3,R0 

.-Check  whether  there  is  how  enouqh  room  to  satisfy  the 

allocation 

GE 

RO, [TempCH  R0,A01,R0 

.-request . 

BT 

R0,“CH  Done 

: Leave  if  so. 

MOVE 

[TempCH  Lock,  AO  I.  RO 

MOVE 

TRUE, Rl 

;If  not,  compact  the  neap  aqain  unless  it  was  just  compacted. 

MOVE 

Rl, [TempCH  Lock,A0j 

BKIL 

R0,“CH  2 

HALT 

haltMemFull 

.-Give  up  if  two  successive  compactions  don't  free  enouqh 

■pace . 

MOVE 

A1 ,  RO 

.-Make  sure  that  A1  is  valid. 

ROT 

RO, -invalid, RO 

BF 

RO, “CH  AlValid 

MOVE 

IDi, RO" 

;If  not,  re-xiate  it. 

XLATE 

RO,  objectXLATE,  A: 

MOVE 

(TempCH  A3,A0),RC 

.Restore  A3,  Q.  and  1D3. 

MOVE 

an , 

MOVE 

[TempCH  ID3,A01,R0 

EQ 

RO, true7ri 

BT 

Rl , “CH  DoneQ 

MOVE 

RO, ID3" 

MOVE 

(TempCH  R2,A0},R2 

.-Restore  R2. 

BR 

“CH  0oni2 

MOVE 

R0,Q 

MOVE 

(TempCH  R2,A0),R2 

.-Restore  R2 . 

MOVE 

(TempCH"R3,A0J,R3 

.-Restore  R3  ano  return. 

MOVE 

(TempCH7FIP,  AOj,  ;p 

ROT 

BO,  -hdrPurgeableN,  B'. 

.-If  this  object  is  purqeab«e.  mark  it  so  that  it  will 

be 

purged 

BF 

R1,-CH  Live 

.-on  the  next  scan. 

OR 

RO, hdrMarked.  RO 

MOVE 

RO, (R2,A0I 

AND 

RO, hdrLenqthM,  RO 

.-Store  the  length  of  the  object  in  RO. 

ROT 

R3,baseN,Rl 

.-Point  a2  to  the  destination  object. 

ADD 

R1,B0,RI 

MOVE 

Rl,  A2 

ROT 

R2,baseN,Rl 

.-Point  A3  to  tne  source  ooject. 

ADD 

Rl,  RO, Rl 

ADD 

R2.  RO, R2 

.-Advance  the  source  and  destination  scanners. 

ADD 

R3,  RO,  B3 

EQUAL 

R2,R3,R0 

.-There  is  no  need  to  move  an  ooject  it  the  source  and 

destination 

BT 

RO, “CH  Compact 

.-addresses  are  the  same. 

MOVE 

Rl,  A3 

MOVE 

(objectID,A3),Rl 

.-Update  the  object's  binoirq  ir.  the  BRAT. 

MOVE 

R2, (TempCH  Src,A01 

.-  Save  R2  . 

CALL 

lookupBindInq 

CHECK 

Rw, ADDR. Rl 

.-Make  sure  tnai  tne  oinoinq  is  at-  ADDR. 

BF 

Rl, “CH  BRATError 

MOVE 

A?,  RO 

OR 

R0,rel,RC 

MOVE 

BO. (Rr, AO ! 

MOVE 

0,R0 

CALL 

blockMove 

.-Move  the  object  to  its  ccsiir.ation  location. 

MOVE 

(TempCH_Src,A0),R2 

; Restore  R? - 

BS  ■‘CH_Compact 

HALT  haliHeap 

HALT  haltBRATType 

-  IP;absifault!uncheckedi Compact heap<<of  f  setN 
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I  Allocate  and  initialize  a  new  heap  object.  HO  containe  the  word  to  be  stored  as  the  ) 
I  first  word  of  the  object.  The  lenqth  Is  extracted  from  ftO,  and  the  flags  in  the  i 
I  high  bits  of  RO  should  be  set  to  benign  values.  R)  contains  the  ID  for  the  object.  i 
I  The  object  is  not  entered  in  the  XLATE  and  the  BRAT  tables.  I 


I 

[Call:  allocObject 


I 

lln: 

I 

I 


RO  First  word  of  object. 

R1  ID  of  the  object. 


AID2  “Object. 

RO  AOOR  pointing  to  object. 


[Criticality  3. 

I 

{Alters  R0/R2/R3/AID2. 


AllocObject : 

MOVE 

R0,R3 

.Save  RO.  Criticality  S. 

MOVE 

RI, ID2 

.'Store  the  object’s  ID  in  1D2. 

AO_Retry: 

MOVE 

[FirstFree. AOl.Rl 

.'Advance  the  heap  scanner. 

AND 

R3,  hdrLengthM, R2 

ADD 

R1,R2,R2 

MOVE 

(LastFree.AO]«RO 

.'Always  leave  three  words  on  the  heap  in  case  a  BRAT  entry 

SUB 

R0<  3,R0 

.-needs  to  be  allocated  for  this  object. 

GT 

R2,R0,R0 

.'Check  whether  the  heap  overflowed. 

BF 

RO, “AO_l 

AND 

R3,  hdrLenqthM, RO 

:lf  it  did,  compact  the  heap,  telling  the  compactor  that 

ADD 

R0,3,R0 

:at  least  three  plus  the  length  of  the  object  words  are  needed 

MOVE 

FIP,R2 

MOVE 

R2,F 

.Criticality  3. 

CALL 

compactHeap 

MOVE 

TRUE, RO 

MOVE 

RO,  F 

.-Criticality  6. 

MOVE 

Ri. r  .r 

BR 

“AO  Retry 

:Go  try  the  allocation  again. 

AO_l  : 

MOVE 

R2,TFiraLFree, A0| 

ROT 

RI,baseN,RI 

.-Create  a  base/address  pair  for  the  object. 

AND 

R3, hdrLenqthM, RO 

OR 

R1,R0,R0 

OR 

RO, rel , RO 

.-Mark  tne  object  as  relocatable. 

WTAC 

RO, ADOR, RQ 

MOVE 

RO,  A2 

.-Store  a  pointer  to  the  object  in  A2 . 

MOVE 

102, R1 

MOVE 

R3, (objectHeaaer, A2 1 

.-Write  the  object's  header  ana  ID. 

MOVE 

RI, /objectio, A2; 

MOVE 

FIP,  IP 

fltAllocObject  -  IP : abs I f ault I uncnecKed I AliocODject<<of f setN 
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##  « 

II  BRAT  H4n«9tr  I 

II  I 

|lllllllll«#ll«|#|| 


I  Ent*r  a  binding  of  R1  to  RO  in  th«  BRAT.  The  BRAT  should  not  hsve  sr  existing  i 

I  binding  of  Rl.  it  may  also  be  appropriate  to  enter  the  object  into  .he  xlate  table,  i 


I 

I  Call:  enterBinding 

I 

lln:  RO  Data. 

I  Rl  Key. 

I 

ICriticality  1. 

( 

lAlters  R0-R3/A1D2. 

I 


EnterBinding*. 
EB  1 ; 


EB  BRATFull! 


EB^HeapFuXl: 


HOVE  R0,R3 

move  Rl, (TempEB_Key,A01 

ROT  Rl, -8RATLenLog«4,R2 

XOR  R1,R2,R2 

ROT  R2, -BRATLenLog*2,  RO 

XOR  R2,R0,R2 

ROT  R2, -BRATLenLog,RO 

XOR  R2,R0,R2 

MOVE  BRATLength-l.RO 

AND  R2,R0.R2 

DC  BRATStart 

ADD  R2,R0.R0 

MOVE  [BRATFree,A0],R2 

BNZL  R2. *E8_BRATFu11 

MOVE  [TempEB  Rey,A0],Rl 

MOVE  Rl, (R2,A0] 

MOVE  {RO.AOI.Rl 

MOVE  R2,(R0,A0] 

ADD  R2,1,R2 

move  R3, (R2.A0i 

ADD  R2,1.R2 

MOVE  1R2,AO],RO 

move  RO, IBRATFree, AO) 

move  Rl, [R2,A0) 

MOVE  FIP,IP 

MOVE  (LastFree,  AOl  ,R2 

SUB  R2,3,R2 

GE  R2, |FirstFcee.AOJ,RJ 

BF  Rl. 'EB_HeapFuU 

MOVE  R2, {LastFree,  AO) 

ADO  R2,2,R2 

move  KIL.R! 

MOVE  Rl,(R2.A0i 

SUB  R2,2,R2 

BR  -eB_2 

MOVE  FIP,R2 

move  3,B0 

move  R2,F 

CALL  compactHeap 

MOVE  TRUE.RO 

MOVE  RC.F 

MOVE  R2,FIP 

MOVE  ITempEB  Key,A0),Rl 

BR  “EB  1 


.'Criticality  6.  Save  data  in  R3 . 

.-Save  the  key. 

;Calculate  the  hash  code  for  the  xey. 

.‘The  hash  code  is  the  XOR  of  the  BRATLenLog-bit  fields  of 
.'the  key. 


;R2  contains  a  hash  code  between  0  snd  BRATLength-1 . 

;R0  points  to  the  head  of  the  BRAT  chain. 

:R2  points  to  a  free  BRAT  link. 

.-Compact  the  heap  if  the  BRAT  is  full. 

.’Store  the  key  in  tne  link. 

.Save  the  second  link  in  the  cnain  in  Rl. 

.-Make  this  link  be  the  first  in  the  chain. 

.-Store  tne  data  in  the  link. 

;Pot  the  next  free  link  In  BRATFree. 

.'Link  with  the  second  link  in  the  chain. 


.•Attempt  to  allocate  three  words  from  the  back  of  the  heap. 

;Go  compact  the  heap  if  it  was  full. 

:Store  a  NIL  in  the  link  woro  of  the  new  entry  and  go 
;allocate  this  entry  in  the  BRAT. 


.-Save  the  FIP. 

;At  least  three  free  words  are  needed  on  the  heap. 
;Criticdilty  3. 


.•Criticality  6. 

.Restore  the  FIP. 

.Restore  the  key  and  go  bac*  to  tne  oeginning. 


f  ItEnterBinding  -  IP  .-abs  i  faul  t  ^unchecked  lEnterBindinq<<of  f  seiK 


1  Lookup  a  binding  of  Rl  in  the  BRAT.  Return  the 

1  Also  return  the  absolute  address  of  the  binding 

1  modified. 

binding  or  NIL  if  there  isn’t  any.  i 

in  the  BRAT  so  that  it  Can  be  i 

1 

I  Call :  lookupBindlng 

)In:  Rl 

Key. 

lOut:  RO 

1  R2 

Data  or  NIL  if  none. 

Absolute  address  of  data  in  the  BRAT  ivalid  only  when  ROOM’.). 

ICrlticility  5 

lAlters  R0/R2. 

1 

LookupBindlng:  ROT 
XOR 
ROT 
XOR 
ROT 
XOR 
MOVE 
AND 
DC 
ADD 

LB_Next :  ADD 

MOVE 
BNIL 
EO 
BF 
ADD 
MOVE 
MOVE 


Rl, -BRATLenLog'4,  R2 
R1,R2,R2 

R2,  «BRATLenl.oq»2,  RO 
R2.R0,R2 

R2, -BRATLenLoq,  RO 
B2, RO, R2 
BRATLength-l,RO 
R2,R0,R2 
BRATStart-2 
R2,R0,R0 
R0,2,R0 
[RO,AO'j,RO 
RC, “ LB  Done 
Rl,  )R0>0J,R2 
R2. ‘LB  Next 
RO,  1,R2 
fR2,A0I,R0 
FIP,  IP 


.Criticality  6. 

.’Calculate  the  hash  code  for  Rl  . 

.'The  hash  code  is  the  XOR  of  the  four  bytes  of  Rl, 
.'the  same  as  tne  xlATE  hash  cooe. 


;n2  contains  a  hash  coac  betwcer.  C  and  BRATLenqth- 1  . 


Follow  the  iinkeo  list  of  BRAT  entries  starting  with 
the  one  in  RO.  Le.ive  if  RO  is  NIL. 

Coirpare  the  xey  ^gainst  Rl  . 

Ctieck  me  next  entry  if  it  doesn’t  match. 

Otherwise  ret-urr  this  entry’s  data. 
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LB_Done;  MOVE  FIP, IP 

fltLookupBindinq  -  IP  :ab«  I  f ault  I  unch«ck*d  i  LooltupBlndin9«of  f  setK 


I  a  blndinq  of  R1  in  th*  BRAT.  Halt  if  no  suen  bindinq  existed.  I 

I  The  purqeBindlnq  entry  point  also  purges  the  binding  from  tne  xlate  table.  l 


f 

ICall:  deleteBindinq 
ICall:  purgeBinding 
I 

I  In:  R1  Key. 

I 

ICrlticality  S. 

I 

(Alters  RO. 

I 


PurgeBinding :  hove 

ENTER 

DeleteBinding :  HOVE 
MOVE 
ROT 
XOR 
ROT 
XOR 
ROT 
XOR 
MOVE 
AND 
DC 
ADD 

DB  Next :  ADD 

MOVE 
BNIL 
EO 
ar 

ADD 

MOVE 

MOVE 

MOVE 

MOVE 

SUB 

MOVE 

MOVE 

MOVE 

MOVE 

OB  Halt :  HALT 


NIL.RO 

K1,R0 

R2.FOPO 

R3,FOPl 

Rl, -BRATLenLoq«4, R2 
RUR2.R2 

R2, -BRATLenLoq*2,  RO 
R2.R0,R2 

R2, -BRATLenLoq,  RO 

R2,R0,R2 

BRATLength-1 , RO 

R2.R0«R2 

BRATStart-2 

R2, RO, R2 

R2, 2, RO 

[R0.A0J,R2 

R2, *DB  Halt 

R1 ,  :  R27a0 i , R3 

R3,  ''DB  Next 

R2,2,R2 

IR2.A0I,R3 

R3. IRO, AO; 

[BRATPree, ACi,R3 
R3, IR2. AOi 
R2.2.R2 

R2, iBRATFree, AO; 
FOPl,R3 
FOPO, R2 
FIP.  IP 

haltBRATDelete 


.Criticality  6. 
.Criticality  6. 


Purge  the  object's  binding  from  the 
Save  R2  and  R3. 


.‘Calculate  tne  hasn  code  for  R1 . 

.'The  hash  code  is  the  XOR  of  the  four  bytes  of  Rl. 
.'the  same  as  tne  XLATE  hash  code. 


;R2  contains  a  hash  code  between  0  and  BRATLength-l . 


.-Follow  tne  lirvxec  list  of  BRAT  entries  starting  with 
;tne  one  in  ftO.  Leave  if  RO  is  NIL. 

.Compare  tne  xey  aqainst  Rl. 

.  Cnecx  tne  next  e.ntry  if  it  doesn't  match. 

.Otherwise  delete  this  entry. 


fltPurqeBincinq  •  IP : abs i fault i unchecked i PurgeBind i nq<<cf ! sets 
fit  DeleteBinding  -  lP:abs{faulti  unchecked  i  DeieteBir.ding<<of  f  setN 


Cosmos.m 


XLATE  table. 
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I* 

4#  Object  ana  Coniaxt  Hanaqer  #• 

«»  I* 


1  Allocac*  and  initialize  a  new  object  on  tne  local  neap  ana  enter  it  in  tne  X'^Tt  ana  i 
I  BRAT  tables.  RO  contain*  the  class  of  the  object.  i 


iCall:  newLocalOb ject 

iln:  RO  Object's  class. 

I 

lOut:  RO  Object's  ID. 

I 

iCritlcality  1 . 

( 

lAIters  R0-R3/AID2 . 


I  Allocate  and  Initialize  a  new  object  on  the  local  heap  ana  enter  it  in  the  XLRTE  ana 
I  BRAT  tables.  RO  contains  the  word  to  be  storca  as  the  first  wora  of  tne  object. 

I  The  length  is  extractea  from  RO,  ana  the  flags  in  the  high  bits  of  RO  shoula  be  set 
I  to  benign  values.  The  object  gets  the  next  unused  ID. 


(Call :  allocNextObject 

I 

lln:  RO  First  word  of  object. 

lOut:  A1D2  ‘Object . 

I  RO  Object's  ID. 

i 

ICritlcality  3. 

[Alters  R0-R3/AID2 . 


I  Allocate  and  initialize  a  new  object  on  the  loca.  heap  ana  enter  it  in  trt*  XLATL  ana 
I  BRAT  tables.  RC  contains  the  word  to  be  stored  as  tne  first  woro  cf  tne  object - 
I  The  length  is  extractea  from  RC,  ana  tne  f.aqs  in  the  high  bits  of  RC  showid  be  set 
!  to  benign  values.  R1  contains  the  ID  to  be  usea  for  the  object. 


Call  -.  allocNewObject 

In:  RO  First  word  of  object 

R1  Object ' s  ID, 


Out:  AID2 

RO 


‘Object . 
Object's  ID. 


ICritlcality  3. 
(Alters  R0-R3/AID2. 


NewLocalObject ; 

MOVE 

f:p,r2 

Cr  it  ica  . 

ity  6. 

MOVE 

R2.  F 

.-Criticality  1  - 

Xl>Tt 

R0,oDj*ctXLATE.A2 

.Get  the 

object • 

s  first  wora. 

MOVE 

[oClassword, A2 | .  RO 

MOVE 

; LastObject ID,  AO  1 

,R'. 

;Get  the 

next  oc 

ject  ID. 

ADD 

RI,  (l<<serialN)'>l 

,  R1 

ADD 

RI,  l.RI 

MOVE 

Rl,  [LastObjectlD, 

AO 

,•  Aovance 

tne  ot>  j 

ect  ID  counter 

BR 

‘ANO_2 

All ocNext Ob ject 

MOVE 

: La  stOb ject 1 D, AO , 

,  RI 

; Cr i t ica . 

ity  e. 

Get  the  next  object  ID. 

ADD 

R'.,  (l<<seriaiN)-I 

,  R 

ADD 

R : . ; ,  R 1 

MOVE 

R 1 ,  I  La stOb ject ID, 

AO: 

.  Advene*- 

me  obj 

ect  ID  counter 

A1 locHewOb ject : 

MOVE 

ElP. R2 

.-Critical 

;ty  h 

save  FIP- 

MOVE 

H7.  F 

; Cr i t ica . 

:ty  b. 

ANO_2 : 

MOVE 

R2,  ITempANC  FIP.AC 

CALL 

a  I . ocOb  ject 

; A .  . OCJt  e 

t.ne  oL 

jeet 

ENTER 

R ;  ,  R  0 

;Pwl  .t  i 

nto  the 

xlate  cacr.e  anc  the  BRAT  table 

CALI 

enterBlnding 

MOVE 

ID2, RC 

; Loaa  the 

Object 

•s  ID  inif.  HC. 

MOVE 

: TempANO  F  I  P, AO : , 

IP 

f  it  New  Loca  lObject  -  IP:aDsfa.,>t  unchecsecJi  New  Loca  .CD  jcct  <<of  f  set  s 
f  It  A1  locNextOb  ject  •  I P  :  abs  '  f  au  1 1  unchecKed  i  a.  .ocNext Oo'ect  <sr.  f  f  set  n 
f  1  tAi  locNewOb  ject  •  IPiabsfai^ItiunchecueaiAl  iocSpwOD'ect  <<cf  f  set  n 


Allocate  and  ir.it.a. 
context  i.nat  sr.ou.a 
before  that  context 
are  copied  intc-.  the 

;  z  e  a  new  c  o  e  x  t  .  If  I D  1  .  s 

be  dcal.ccatea:  newever.  i*  A.* 
IS  dea  1 ,  cc  ;  t  ^  vj ,  its  loca.s 
new  context 

O''  e-'iry,  :t  i.c . 

‘  he  iO'-.ger  pc  . '  t  s  tc 
.  o.  at  L  O'-  s  -■*  t  •  r ;;  -o'  I  '  , 

- 1  s  t  c  a 

"■es  sage. 

. '  .-...si  ve. 

Call:  newCont  ext 

1  I  n :  RC  First  wor-a  oi  c-^ntext,  ces. 

'PC  .e-qf 

AID’.  "rc' t  ex 

'.  Cr  Nil  I*  r.o-e  a’.re.jay  ex.st 

s 

1  Out  :  ail:  New  m.r 

Cr  it  ica  1  ’1  ty  2  . 

(Alters  RO/RI /AICI . 

text 
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NewContexi 


MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

HOVE 

MOVE 

CALL 

MOVE 

BNIL 

MOVE 

BT 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 


NC_HadMessaqe : 
NC  NoOidCxL : 


K I  P ,  R I 

Rl,  fT€mpNC_f:P, AOi 
Rl,  F 

R2, IT«mpNC  R2,A0] 
ID2,R1 

RI , [TtnpNC  1D2. AO] 
R3, (T«mpNC3R3, AO) 
aliocMRxtObjecc 
IDURO 

RO,  “NC_N001<lCxt 
Q/RO 

RO,  ''NC_Ha(lMesiaqe 

(2.A11.R0 

RO. (2. A2] 

O.Alj.RO 
RO, 13, A2] 

[4.A1],R0 
RO.  [4,A2] 

(5, All. BO 
RO,  f 5, A2; 

(6,  A1  ]  ,R0 
RO.  (6.A2! 

[7.A11.RQ 
RO.  ( 7, A2 1 
(S.All.RO 
RO. [8,A2| 

I9,A11,R0 

RO, (9,A2) 

I  10, A1 ) ,R0 
RO.  [10. A2} 
iU,All,R0 
RO,  (  1  l,A2i 
I  12, A1 1 , RO 
RO, {12. A2; 

( 1 3, A1 ; , RC 
RO.  |13,A2J 
{ 14, A1 1 ,R0 


Criticality  6.  Save  R2,  R3  and  tne  FIP. 

Criticality  3. 

save  ID2  in  TempNC_I02. 

.'Create  the  context  oDjeci . 


:lt  A3  did  not  point  tc  a  message,  copy  the  old  context's 
.‘locals  into  the  new  context. 


MOVE 

RO, {14, A?; 

MOVE 

1  lb, Al), RO 

MOVE 

RC,  ilS.A?] 

CALL 

disposeContext 

•Tn.en  dispose  the  c 

MOVE 

ID?, Ri 

.Point  Al  ana  IDI  i 

XLATE 

Rl.oBjectXLATF.Al 

MOVE 

[TempNC  I0?,ACi,R? 

.  Res' ore  A?  and  ID? 

XLATE 

R2, restoreXLATE, A? 

MOVE 

(TempNC  R2, AOi , R2 

.-Restore  R?  ano  H3- 

MOVE 

' TempNC  R3, AO  1 , R3 

MOVE 

[TempNC'FIP.AC:.  IP 

f  ItNewCont  ext  -  .'P;aB3ifauIt'  ur.cnectced  :  NewCcntext  <<of  i  seiN 


1  Deallocate  a  context,  vf:ich  may  Be  either  a  tast  context  or  a  neap  context. 


ICall:  disposeContext 

I 

lln;  AIDl  Context. 

iCriticality  3. 

i 

I  Alters  RC-R2/a:l'1  . 

I 


1  Dispose  an  oBject .  If  tne  ooject  is  locxec,  it  Is  deleted  as  soon  as  the  un.ocx  i 

I  message  comes  in.  < 


iCall:  disposeOBject 

1 

I  In :  BO  Object , 

I 

ICriticality  3 . 

I 

lAlters  R0-R2, 


DisposeFast Centex 

MOVE 

MOVE 

MOVE 

MOVE 


MOVE  ;  Fa5tConlexiO-..e...C', 
RC,  [ context  Next , A 1 ! 
IDl , RC 

RC,  iFastContexlQje.. 
FIP, IP 


A0’,RC  .-Cr 
,PJt 


e,  AC ! 


t .ca 1  it y  fc 
he  context  Uacx  on 


cont  ext  q^e.je  . 


Di sposeContext ;  MOVE 
ROT 
BT 

MOVE 

Dl sposect ;ect ;  MOVE 
MOVE 
XLATE 
RNI  L 
MOVE 
CHECK 
BT 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE. 

MOVE 


IcontextHeaOer.Al '.RC 

RC, -ndrEast  N, RC 

RC,  "Dispose FasiCor-.tex 

IDl.RC 

FIP, R? 

R?,  F 

RC.  localXlATK,  S'. 

B 1 . "DO  Done 

R2,  iler^pDO  FIP,  a:- 

R 1 , 1  NT, r; 

R2, "DO  Remcte 
TR-v'E,  H? 

R2.  w 
:D2, R? 

RO, ID? 

R  1 ,  A? 

on  lertfieadej ,  A?  .s. 
HI,  -ndr  I.ocsedS,  - 
H 1 , *  DC  Lorxod 
BO,  ;  NOdeM.1  S  X  ,  A'..  ,  'r'  , 

NNS. SO 
RO, » I , RC 
RO,  "DC-  Home 


.Critical. ty  b  Cnecx  wnether  this  was  a  fast  context. 
;  Yes  - 

;No.  Dea.locate  a  norma,  object 
.Critical;ty  fc 
; Cr - t lea . . t y  3 . 

;Cet  the  of’oet  .ocaticn  into  RI. 

.Exit  if  tne  object  was  a  constant. 

.Save  tnp  EIP. 

.DC  send  a  D.speso  message  if  tne  object  is  remote. 
.Enter  •..'•.c.’.erxpc  ncae. 


If  tr:e  C-:  'oct  Is  .ora.,  point  AID?  to  it. 
r.lT*t  Jf.e'.e  ..j.-ROd  cc’ect 


./•erx  er  .s  tn.e  Cbiert's  nomr. 


\ 


203 


Concurrent  Smalltalk  on  the  Message-Driven  Processor 


MOVE 

ID2.R1  ; 

If  .not,  send  a  message  to  the 

ucject’s  home  to  delete 

DC 

MSG :  msgOisposeBRAT-^? 

its  BRAT  entry - 

SEND20 

Rl.RO 

SENDEO 

Rl 

DO_ 

Hone: 

CALL 

deallocateObject 

Deallocate  it. 

XLATE 

R2, restoreXLATE.A? 

Restore  AID2. 

MOVE 

(TempDO  F2P, AO] , IP 

DO 

Done : 

MOVE 

R2,IP 

do' 

Remote: 

MOVE 

R0.R2 

Send  a  Dispose  message  to  the 

object  *  s  node . 

SENDO 

Rl 

DC 

MSC:msgDispose^2 

SEND2E0 

RO.  R2 

MOVE 

[TempDO^FIP, AO)  ,  IP 

DO_ 

Locked : 

ROT 

Rl , hdr LockedN-hdrDeletedh 

I,  RO 

OR 

R0,1,R0 

If  the  Object  is  locked,  mark 

It  «s  deleted  but  do  not 

ROT 

.<0,  hdrDeletedN,  RO 

delete  it  yet . 

MOVE 

RO, [objectHeader, A2} 

MOVE 

(TempDO_^FlP,  AO) ,  IP 

f ItDlsposeContext  •  IP:«bsifaulti unchseked i Dlspos«Context<<of I setN 
f  ItDixposeObjdct  »  lP:abs  i  fault  iDispoadObject<<offsei.N 


Execute  a  Dispose  messaqe. 


Dispose:  MOVE  { dl sposelO« A3] ,  RO 

CALL  dl sposeObject 
SUSPEND 

msqDlspose  -  DiBpose<<of f setN 


.‘Criticality  2. 

: Dispose  the  object. 


:  Execute  a  DisposeBRAT  message.  M  the  object  was  present  or.  its  home  node,  it  is 
I  disposed;  otherwise,  oniy  the  object’s  home  BRAT  entry  Is  deleted. 


DisposeBRAT: 

MOVE 

i di sposeBRATiD, A3 } , R 

XLATE 

Rl, localXLATE,  R2 

CHECK 

R2, ADDR, R3 

BT 

R3,*0BRAT  Here 

CALL 

SUSPEND 

purqeBindIng 

DBRAT^Here : 

MOVE 

R1,R0 

CALL 

SUSPEND 

disposeOnject 

;Criticaiity  2. 

.Check  if  the  object  is  hero  too. 

;If  sc.  dispose  it. 

.-Purge  tho  object’s  binding  from  the  XLATE  taole  ano 
.-fro.m  the  BRAT. 

;0:spose  the  object. 


msgOisposeBRAT  •  DlsposeBRAT<<off seiN 


I  Deallocate  an  object  residing  on  this  node.  l‘'e  object  must  not  be  locked. 


I  Call :  deallocateObject 

I 

(In:  AI''2  Object, 

icritical^ty  4. 
lAlters  RO/Rl . 


DeallocateObject:  MOVE 
MOVE 
DC 

MOVE 
OB 

HOVE 
MOVE 
CA1.1 
MOVE 
MOVE 
MOVE 

f ItDeai iocateobject  -  :P;abs!fajltiuncneckediDca  * locateOb^oct  <<of  f  set K 


FIP.Rl  .Criticality  6-  Save  the  FIP. 

Rl, (TempDealioc  FIP.AC 
OBJ;hdrOeleted 

R'l ,  F  ;  Cr it  ical  Ity  4  . 

RO,  !  Ob  ject  Header,  A2J .  BO  .-Set  the  deleted  flag  in  the  object  heaoer. 

RO.  i  Ob jectvieade t .  A2  1 

[object  ID,  A2 ),  R 1  .Delete  the  object's  binding  Iroir.  the  BRAT  ana  the  xlate  table. 

purgeBindlng 

NII..R0 

RO, :D2  -C.ear  ID? . 

[ TempDea llOcFIP.AOj, IP 


2C4 


Appendix  F 


Cosmos  Listing 


Cosmos.m 


II  II 

II  GlobAl  Object  Meneger  II 

II  II 


I  Allocate  and  initialize  a  new  object  on  the  heap  of  a  random  node  and  enter  it  in  i 
I  that  node's  XX>TE  and  BRAT  tables.  RO  contains  the  class  of  the  object.  I 


I 

ICall:  newObject 
I 

lln:  RO  Object's  class. 

I 

lout:  RO  Object's  ID. 

I 

icriticality  l. 

I 

(Alters  RO/Rl . 

I 


NetK}bject : 


MOVE 

FIP.Rl 

.•Criticality  6. 

MOVE 

R1,F 

.•Criticality  3. 

MOVE 

Rl,  (contextlP, Al) 

.•Save  the  state  in  the  context. 

MOVE 

R2,  IcontextR2, AI ] 

MOVE 

R3,  [contextR3. Al ) 

MOVE 

RO,  R3 

.-Save  tne  class  in  R3. 

DC 

RandomSeedIncrement 

MOVE 

[RandomSeed, AO ) , R2 

.-Generate  a  random  node  number. 

ADD 

R2,R0.R2 

.•Advance  the  random  node  counter  and  return  its 

new  value 

MOVE 

R2, [RandomSeed, AO ) 

AND 

R2, i NodeMask, AS ! , R2 

DC 

MSG : ffisqNewOD'ect  *4 

.-Send  a  KewUbject  message  to  that  node. 

SEND20 

R2,  RC 

MOVE 

1D1,R2 

SEND2C 

R3,  R2 

SENDEO 

contextRU 

DC 

CFUT:coniextRC 

MOVE 

RO, (contextNext, Al j 

;Tell  the  context  to  wail  for  tne  quasi-cf uture 

in  RC. 

MOVE 

RO, (contextRO, AI ) 

;S*.ore  a  cfutjre  in  RO. 

DC 

SaveSt4terD023-  (••'21 

;Co  save  t.he  context  ano  suspend. 

BR 

RO 

fltNewCoject  -  1 P ; abs i fault i unchecKea i NewObject <<oi f set N 


I  Execute  a  NewObject  message.  Return  the  object's  ID  to  the  caller. 


NewObjectM: 


MOVE 

|newObjClass,A3,,RC 

icriticality  C. 

CALL 

newLocalObject 

.Allocate  the  object  .ocally 

MOVE 

R0,R2 

DC 

MSG:msgReply*l 

.•Reply  with  the  object's  ID. 

MOVE 

(newObjReplyID,A31,R: 

SENO20 

Rl.RO 

SENDO 

r: 

SEND2E0 

SUSPEND 

[newObjRepiySlot, A31  ,R2 

msgNewObject  •  NewObjectM<<of f setN 


I  Return  the  class  of  an  object.  Typeof  returns  the  class  as  an  integer,  while 
I  classOf  wraps  it  as  a  class.  Tne  argument  of  TypeOf  must  not  oe  a  future. 


I 

ICall:  ClassOf 
ICall:  typeof 

I 

lln:'  RO  Object. 

I 

lOut:  RO  The  object's  class. 

I 

ICriticality  1 
I 

lAlters  R0/R1/a:D2. 


ClassOf:  MOVE 

MOVE 
BNNll. 

TOf_2-.  MOVE 

CALL 
MOVE 
ROT 
OR 

WTAG 
MOVE 

fltClassOf  -  I p : abs ' fault  I ClassOf <<of f SPIN 


FIP, R. 

Ri.  r 

RC,  "TOf  _7 

Rl , ITempTOf  Y  IP. AC 
t  ypeOf 
SuoC  IASS, R 1 
Rl ,  subtagN,  R'. 

RO, Rl , RO 

RO, TACO, RC 

‘TempTCf  FIP, AC., IP 


Criticality 
Critical ity  1 . 

Force  tne  a :  qu.me^t  . 
Save  tne  r  I P  in  r;»'ncry 
Get  the  ir.teQrr  type 


Save  the  FIP 


a  writ*'  its  tag  ana  subtag. 


TypeOf : 

RTAG 

RO,  Rl 

BR 

Rl 

ROT 

RO, 'SuOtagN,  R 

BR 

•TOf_  TACC 

BR 

“TOf" Integer 

BT 

RO,  “TOf  TrviP 

dR 

■TOf_Fa:se 

MALT 

halt  TypeCf 

HALT 

ha  1 tTypeCf 

MALT 

f,a  1 1  Typeuf 

HALT 

ha  I tTypeCf 

HALT 

ha ItTypeOf 

.Critica.ity  b.  L.spatcr.  on  the  tag  of  the  object. 


:  TAG 


C 


;  :nt 

.IKX'I 

;  AOLH 
;  IP 

;MSG  /  Uii- 


;FuT 
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MOVE 

FIP*  R1 

; ID.  Save  the  TIP . 

BR 

*TOf  Object 

MOVE 

FIP.Rl 

.-DID.  Save  tne  FIP. 

BR 

“TOf  Object 

HALT 

haltTypeOf 

;  TAGA 

BR 

-•TOf  Float 

.•FLOAT 

HALT 

haltTypeOf 

:  INSTQ 

HALT 

haltTypeOf 

; INSTl 

HALT 

haltTypeOf 

; 1NST2 

HALT 

haltTypeOf 

; INST3 

TOf  Integer: 

DC 

classlnteger 

.-Return  the  integer  class. 

MOVE 

FIP, IP 

TOf  True: 

DC 

classTrue 

.•Return  the  true  class. 

MOVE 

FIP* IP 

TOf  ralee: 

DC 

claisFalse 

.‘Return  the  false  class. 

MOVE 

FIP, IP 

TOf  Float: 

DC 

classFloat 

;Return  the  float  class. 

MOVE 

FIP, IP 

TOf  Object: 

MOVE 

Ri*r 

.•Criticality  1. 

XLATE 

RO* ob jectXLAT£*A2 

MOVE 

(Ob ject Header, A2 1 ,  RO 

.‘Extract  the  class  from  the  object 

MTAG 

RO* INT, RO 

ROT 

R0.-hdrClassN*R0 

AND 

RO*  hdrClassM.RO 

;R0  now  contains  tne  class. 

MOVE 

Rl*  IP 

TOf  TAGO: 

AND 

R1 , subtaqM*  Rl 

.‘Dispatch  on  tne  subtag. 

BR 

Rl 

BNNIL 

RO*  *TOf  Symbol 

: subSTK 

BR 

“TOf  NIL 

HOVE 

FIP,R1 

.’SubCLASS.  Save  the  FIP. 

BR 

“TOf  Object 

BR 

“TOf  Selector 

:  subSEL 

BR 

“TOf  Character 

.*  SUbCHAR 

HALT 

haltTypeOf 

HALT 

haltTypeOf 

HALT 

haltTypeOf 

HALT 

haltTypeOf 

HALT 

haltTypeOf 

HALT 

haltTypeOf 

HALT 

haltTypeOf 

HALT 

haltTypeOf 

HALT 

haltTypeOf 

HALT 

haltTypeOf 

HALT 

haltTypeOf 

HALT 

haltTypeOf 

TOf  Symbol: 

DC 

classSymooi 

.-Return  tne  symbe;  cisss. 

MOVE 

FIP*  IP 

TOf  NIL: 

DC 

classNul 1 

.-Return  me  nul.  Cisss. 

MOVE 

FIP,  IP 

TOf  Selector; 

DC 

ciassSelector 

.-Return  the  selector  class. 

MOVE 

rrp*  IP 

TOf  Character: 

DC 

classCharacter 

.-Return  the  character  class. 

MOVE 

FIP,  IP 

flt7yp«Of  •  IP  :  abs  I  f«ulc  I  unchecited  I  TypeOf  <<of  f  setN 


Return  th«  noci«  on  which  th«  object  might  reside.  If  the  object  it  t  constant* 
return  a  random  node  number.  If  the  object  is  a  CIO*  return  a  random  constituent. 


Call:  objectNode 


1 

tin:  RO 

Object 

tout:  Rl 

Number 

tagged 

of  node 
INTeger 

likely  to  contain  object.  The  number  may  no 
and  it  may  contain  junk  oata  in  the  high  Ifc 

necessarily  be 
bit  s . 

iCrltlcalit-y  b 

1  Alters  RO/Rl . 

1 

ObjectNode : 

RTAG 

BR 

R0,R1 

Rl 

.•Criticality  6. 

.-Dispatch  on  the  tag  of  the 

object . 

MOVE  iRindomSeed,AO/,Rl  .TACO 

BR  '*ON_Random 

MOVE  (RandomSeed,  AOJ ,  R'i  ;INT 

BR  “ONRandom 

MOVE  [RandomSeed, AC!*R1  ; BOOL 

BR  “ON_Random 

HALT  halt  Interna  iType  .-ADDR 

HALT  haltlnternalType  .-IP 

HALT  haltlnternalType  .MSG  /  OBJ 

HALT  haltlnternalType  ;CFt/T 

MOVE  R0,R1  ;F'UT.  Return  the  F’JT;  the  noae  r.umher  is  in  the  low 

MOVE  FIP,  IP  ;  le.  bits 

MOVE  RC,R1  ;ID.  Return  the  10:  the  node  .•'.umber  is  in  the  low 

MOVE  F3P,  IP  .-16  bits. 

MOVE  R0,R1  .0:0 

BR  “'Randomronst 

MOVE  IRandomSeeo,  AC  j ,  H’.  AGA 

BR  ■‘CN^Randoit 

MOVE  [RandomSeed,  AO R’,  .f:x;AT 

BR  "ON_Random 

HALT  ha  1 1 1 nt erna 1  Type  .ISSTO 

HALT  haltlnternalType  ;1KST1 

HALT  haltlnternalType  ;IN5T7 

HALT  ha  1 1 Internal  Type  ;;sS!3 

ON_Random:  DC  RandomSeed' ncrewent 

ADD  R1,RC,R'.  .Aovanre  tfe  ra:  o:—  node  cu..t‘.er  ana  rvtjrr  its  new 

MOVE  r:, [RandomSeed, AC 

AND  Rl, ! NodeMasK, AC : , R . 

MOVE  FIP, IP 
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Cosmos  Listing 


Cosmos.m 


RandonConst : 

MOVE 

R2,  FOPO 

ROT 

Rl, -flodStriaeN»iooStrideL), A2  .-Rl  has  s,  the  distobj  initial  node  number^  in  bits 

ASH 

R2,-ie,R2 

.■and  e,  2*a  complement  loqStride,  in  bits  11..  IS. 

ASH 

R2« logStrideL-16. R2 

;R2:-e. 

MOVE 

•1,R0 

ASH 

R0,R2,R2 

:R2:*e  zeros  in  LSBs  with  the  MSBs  being  ones. 

MOVE 

{ RandoreSeed, AO } , RO 

ADD 

RO.  7,R0 

.'Advance  tne  random  node  counter  and  return  its  new  value. 

MOVE 

RO, {RandomSeed« AO) 

AND 

R0,R2,R2 

BR 

‘‘CetConst 

fltObjectNode 

•  IP:abslfauiti unchecked i Object Node<<of  f set N 

i  Return  the  ID  of  the  preferred  constituent  of  a  distributed  object  with  the  qiven  I 
)  DIO.  I 

- - - - - ... - - - - - - - -♦ 

I 

I  Call :  pref erredConst ituent 
I 

lln:  R1  DID. 

I 

tout:  R1  ID. 

I 

iCritlcallty  5. 

I 

I  Alters  RO/Rl . 

I 


PreferredConst : 


CetConst : 


PrefCnst  Dense: 


MOVE 

R2,FOPO 

.Criticality  $. 

ROT 

R1 , * <loqStrideN«logStrideL> , R2  :R1  has  t,  the  distobj  initial  node  number,  in  bits  0..10 

ASH 

R2,-16,R2 

.-and  e,  ?*s  corplemont  loqStrlde,  In  bits  11.. 15. 

ASH 

R2,  loqStriaeL-16, R2 

;R2:-e. 

LE 

R2, 0, RO 

.-Jump  to  a  faster  routine  if  the  distributed  object  is  dense. 

BT 

RO, “PrefCnst  Dense 

MOVE 

-1,R0 

ASH 

RO,  R2, R2 

;R2;“e  zeros  in  LSBs  with  the  MSBs  being  ones. 

MOVE 

ISerialNode,A0),R0 

AND 

RO,  R2, R2 

;R2:*maskea  serial  node  number. 

DC 

inlt lalNodeM 

AND 

R0,R1,R0 

OR 

RO, R2, R2 

;R2:“aerial  constituent  node  number. 

DC 

ID : -homeNooeM 

AND 

R0,R1,RI 

; R1 1 D: ser ia 1  number . 

AND 

R2, xM. RO 

OR 

R1,R0,R1 

.Store  the  x  nooe  rsumoer  in  Rl. 

ROT 

R2.  -XL,  RO 

AND 

RO, yM. RO 

ROT 

RO, yN. RO 

OR 

BI.R0,R1 

.•Store  tne  y  nooe  number  in  Rl  . 

ROT 

R2,-  (xL^yLl,  R2 

AND 

B2, ZM,R2 

ROT 

R2,ZN,R2 

OR 

R1,R2,R1 

.•Store  the  z  node  r.jnoer  in  fti  . 

MOVE 

FOPO,R2 

MOVE 

FIP,  IP 

DC 

10 : -homeNodeM 

AND 

RO. R1 , R1 

;Rl :-:D; serial  number. 

MOVE 

NNR, RO 

.-This  is  a  dense  distributed  object;  just  use  the  current  node. 

OR 

R1,R0,RI 

MOVE 

FOPO,R2 

MOVE 

FIP,  IP 

fltPref erredConst ituent  -  IPiabsifault' unchecneo i Pref er reoConst<<oI ; setN 


1  Return  the 

ID  of  the 

nth  constituent 

of  a  distributed  object  with  the  given  DID. 

ICall:  CO 

lln:  RO 

n . 

1  Rl 

DID. 

lOut:  Rl 

ID. 

1  Criticality 

1  . 

[Alters  R0-R2 
1 

Co: 

MOVE 

H2, FOPO 

;Critlcality  6.  Save  H2  and  R3. 

MOVE 

R3. fop: 

CHECK 

RO, INI. R2 

BF 

R2, “Co  HadType 

CHECK 

R1,DID,R2 

BF 

R2,*Co  Badlype 

LT 

RC, C, R2 

BT 

H2,“Co  Qadka.nqe 

ROT 

R 1 ,  -  UoqSi  r  ;aeN 

•  logSt  r  idel.) ,  H2  .Rl  has  s,  t.ne  0;stoD-^  initial  node  number,  in  bits  0 

ASH 

R2,-16,R? 

;ar.d  c,  2‘s  ccrp.empr.t  logStride,  in  bits  ’.  1  .  .  1 5 . 

ASH 

R2 ,  loqSt  ride;-’. 

6,R2  ;R?:-e. 

LT 

R2,  Q.HJ 

BP 

B3,  “Cc  Spar.se 

.Tf.pro  Is  at  .most  or.o  constituent  per  nooe. 

NEC 

B?,R2 

.There  are  multiple  cor.st  i  t  uenl  s  per  node. 

MOVE 

-I.R3 

ASH 

R3, B2, B3 

NOT 

R3,  R3 

;H3  ronta.ns  trie  ri.moer  of  constituents  per  node  minus  one. 

AND 

RO, R3, R3 

;Moa...c  r  by  the  rum.ijer  of  constituents  per  node  to  get  the 

ROT 

R3,  ser  ialN,  R,' 

;  01  sp,  aremc'f  ’  to  t:e  aaoeo  to  the  DID's  serial  number. 

AOn 

ri,r3,h: 

KEG 

R2.  R2 

ASH 

HO. R2, HC 

.Divide  n  by  the  r.u-.ber  of  constituents  per  node. 
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MOVE 

0,  R2 

.-Now  assume  that  there 

is  one  constituent  per  node. 

Co  Sparse: 

FFB 

RC,  R3 

;R3:-30“lqin) . 

SUB 

R3, R2, R3 

;R3 :«30-lqtnode*> . 

ASH 

RO, R2, R2 

; R2 : -nodet ■ 

MOVE 

30-LogNNodes,  RO 

;RQ;-30-‘lg(NNoaes)  . 

GT 

R3,  RO, RO 

BF 

RO, 'Co  BadRange 

HOVE 

FOP  1 .  R  J 

BR 

'CetConst 

Co  BadRange 

HALT 

haltRange 

Co  BadTvbe: 

CHECK 

R0,FUT,R2 

:If  either  operand  was 

a  future,  crash  with  the  future 

BT 

R2, 'Co  Future 

.'Otherwise,  crash  with 

the  type  fault . 

CHECK 

R1,FUT,R2 

BT 

R2, 'Co  Future 

HALT 

haltTy^ 

Co^Future : 

HALT 

haltFuture 

fltCo  -  IP: 

abs 1  fault  1  unchecked  1 Co<<of f setN 

I  Bx«cut«  »  R*questObj*ct  mejxxqe.  The  10  passed  in  the  reessaqe  must  he  a  word  tsqqed  i 
I  ID,  •  clsas  or  a  selector:  It  cannot  be  a  future,  constant,  or  distributed  object.  | 


RequestObject : 

HOVE 

[reqObjID.A3),R3 

:Criticality  4. 

XLATE 

R3, lOCalXLATE, R1 

;Is  the  object  here? 

CHECK 

Rl, ADDR,R2 

BT 

R2. “RO  Local 

.'Yes . 

RO_Resend : 

DC 

MSG : msqRequestOb ject *3 

;No.  Send  a  message  requesting  the  object  to  the  object's 

SEND20 

Rl.RO 

.'lixely  location. 

SENDO 

R3 

SENDEO 

SUSPEND 

( reqObj Reply Node, A3) 

RO_Locked : 

MOVE 

NNR, RI 

.-Criticality  4.  Resend  tne  message  back  to  this  node. 

BR 

'RO  Resend 

RO_Local : 

MOVE 

R3, 102 

.-Criticality  b.  Point  A1C2  to  the  object. 

MOVE 

Rl,  A2 

MOVE 

lob ject Header , A2i ,  R2 

ROT 

R2, ‘hdrLockedN,  RC 

;Resena  the  message  oacx  to  tn.s  node  if  the  object  is  locked. 

BT 

RO, 'RO  Locked 

AND 

R2, hdr  LenqthM, R3 

DC 

MSG tmsqMiqrateOb ject  *1 

.-The  length  o!  tne  message  is  one  plus  the  length  of  the 

ADO 

RO, R3, RO 

; Object  - 

SUB 

R3,  1,R3 

MOVE 

(R3,A2J,R3 

.'Save  the  last  word  of  tne  object  in  R3. 

SUB 

Rl,  1,R1 

:Sr.orten  the  object's  limit  oy  one  word. 

MOVE 

Rl,  A2 

SENO20 

( reqOb jRepi yKodo,  A3 { ,  RC 

.-Send  the  message  header. 

SENDO 

R2 

: Sene  the  words  of  the  object. 

MOVE 

0,  RO 

CALL 

blockSend 

SENDEO 

R3 

.‘Send  the  last  word  of  tne  object. 

ROT 

R2,  -ndrCopyableN,  RO 

.-Leave  object  .here  if  J:  is  copyacle. 

BT 

RO, 'RO  Copyabie 

MOVE 

102, Rl" 

;If  the  Object  is  not  copyabre,  purge  it  from  the  xlate  table 

AND 

Rl, [NodeHask,A0I.R2 

.-and  from  t.ne  BRat,  jn.pess  this  is  its  home  r\ode,  in  which 

MOVE 

NNR, R3 

.'Case  purge  it  from  tne  xlate  table  and  replace  its  BRAT  entry 

EQUAL 

R2,R3.R0 

.-to  suggest  that  it  is  present  on  this  node;  messages  requesting 

BT 

RO, 'RO  Home 

;the  Object  will  xeep  cycling  at  this  node  until  the  object's 

CALL 

deal 1 oca teOb ject 

;new  location  is  known. 

RO^Copyable : 

SUSPEND 

RO^Home : 

DC 

OBJ  rhdrDeleted 

OR 

RO,  Ob jectMeader,  A2  J ,  RO 

.Set  the  deleted  flag  in  the  object  neader . 

MOVE 

RO,  [ objectHeader ,  A2 j 

MOVE 

NIL,  RO 

ENTER 

Rl ,  RO 

CALL 

lookupBindinq 

BNIL 

RO, 'RO  NoBindinq 

MOVE 

SUSPEND 

R3, [R27A0i 

.'Pretend  that  the  object  is  located  at  this  node. 

RO  NoBindinq: 

HALT 

ha ItBRATMi ssinq 

msqRequestOb ject  -  unchecked i Reques t Ob leci <<ot i soiN 


1  Execute  an  AcceptObject  messaqe.  Make  this  node  thi*  object's  hoTP  The  ubjeci’s  ID 
I  must  reflect  tnis  node  as  tne  object’s  home. 


AcceptODject ; 


AO  Done: 


MOVE  ;  2«oD3ect  Header ,  A3  1 ,  RC  :  Cr  1 1  i  1  ity  3.  Read  the  uc.e-t's  header  and  ID, 

MOVE  '?*0bjectlD.A31,Rl 

CALL  al locNewODject  .A, locate  spare  for  the  obect. 

MOVE  A2,B1 

DC  (- ( 1  <<baseNy  « 1  j  •  ?  .-Decrease  A?*s  base  by  twc  wurds  and  increase  its  limit  likewise 

ADD  R1,R0,RC  .-because  the  object  starts  .r.  the  message  two  words  late. 

MOVE  R0,A2 

MOVE  4,R0  ,-CtJpy  the  object  into  tnc  r^eap  starting  from  the  fifth  word 

CALL  biockMove  .-of  the  message  jtt;iro  wora  cf  the  object). 

MOVE  [1,A3;,R3 

BNIL  R3, 'AO  Done  ;  Ac  snow  i  edge  the  leraer  if  a?:  acunow.edqement  was  requested. 

DC  MSG  :  TT  sg  Ac  know  iedqeOb  ject  *2 

SLNDPO  HJ.RC 

SEN DEO  1/ •object  ID, A3' 

SDSPF.ND 


msqAcceptOb  ject  -  unchecked  :  AcceptOb  ject  <<of set  n 
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Appendix  F 


Cosmos  Listing 


Cosnnos.m 


I  Exacute  a  MiqrateObject  Raaaaqe.  If  the  object  is  copyable,  store  a  copy  of  it  on  i 
I  this  node.  If  the  object  Is  not  copyable«  store  it  on  this  node^  lock  It,  and  i 
I  inform  the  home  node  about  the  object's  presence  here.  i 


HiqrateObject : 

HOVE 

(  WobjectHeader,  A3I .  RO 

rCriticality  3.  Read  tne  object's  header  and  ID. 

HOVE 

(l«objectlD,A31,Rl 

ROT 

RO, -hdrCopyableN, RO 

AND 

RO,SFFFFFFF1,RO 

rClear  tne  purqeable,  locked,  and  marked  flags. 

BT 

RO, *MO  Copyable 

;lf  the  object  is  copyable,  make  this  copy  purqeable. 

AND 

Rl, (NodeMa8k,A0],R2 

rThis  object  is  noncopyable. 

MOVE 

NNR,R3 

EQUAL 

R2,R3,R2 

;Cneck  whether  this  node  is  the  object's  home  node. 

BT 

R2,'‘MO  Noncopyable 

:If  so,  do  nothing. 

HOVE 

R0,R2 

rSave  RO. 

DC 

MSG:msqUpdateHome*3 

rOtherwise,  tell  tne  home  node  about  this  object's  location 

SEND20 

R1,R0 

;and  lock  the  object  until  the  home  node  replies. 

SEND2E0 

R1,R3 

OR 

R2, 1<< (hdrLockedN-hdrCopyableN) , RO 

BR 

‘‘HO  Noncopyable 

;  Lock  this  object. 

MO_Copyable : 

OR 

RO, I<< (hdrPurqeableN'hdrCopyableN) ,  RO 

MO_Noncopyable ; 

ROT 

RO, hdrCopyableN,  RO 

rAllocate  storage  for  the  object  and  put  It  into  the  xlate  table 

CALL 

allocObject 

ENTER 

R1,R0 

rCriticaiity  S. 

MOVE 

R0,R3 

CALL 

lookupBirdlnq 

rCheck  whether  a  binding  for  the  object  existed  in  the  BRAT. 

BNIL 

RO, “MO  Unexpected 

HOVE 

R3,  IR27a01 

rif  one  did  exist,  save  its  data  in  R2  and  rebind  the  BRAT 

MOVE 

R0,R2 

rentry  to  point  to  the  object. 

BR 

“MO  Expected 

MO^Unexpected: 

MOVE 

R3,R0 

CALL 

enterBindinq 

;Tnis  obeys  criticality  5  because  allocObject  allocates  three 

MOVE 

NIL,R2 

.-extra  heap  words. 

MO  Expected: 

MOVE 

A2,R0 

.'Decrease  A2's  base  oy  one  word  and  increase  its  limit 

SUB 

RO,  n<<baseN)  -1 ,  RO 

roecause  the  Object  starts  in  the  message  one  word  late. 

MOVE 

RO,  A2 

MOVE 

3,R0 

.-Copy  the  object  into  the  heap  starting  from  the  fourth  word 

CALL 

blockMove 

.-of  the  message  (third  word  of  the  object). 

CHECK 

R2, ID, RO 

BF 

RO, “MO_Suspend 

.-Leave  if  there  are  no  contexts  to  restart. 

MOVE 

Fa St Con text Queue,  AO; ,  RO 

MOVE 

RO, (context Next ,  A1 ) 

.-Dispose  the  current  context. 

MOVE 

ID1,R0 

MOVE 

RO,  1  FastContextQueue,  AO ’ 

MO^NextRestart : 

XLATE 

R2, objeCtXLATE,  a: 

.'Restart  contexts. 

MOVE 

(context Next , A1 ) ,  Rl 

CHECK 

Rl.  ID,R3 

BF 

R3, “Reply  Restart 

.-Restart  this  context  if  there  is  only  o.ne  to  be  restarted. 

DC 

MSG :msqRest art  Context  *2 

.-Otherwise  send  a  message  bacK  to  this  node  to  restart  this 

SEK020 

R2,R0 

.-context  and  go  restart  the  next  one  now. 

SENOEO 

R2 

MOVE 

B1,R2 

BR 

“MO^NextResiart 

MO_Su8pend : 

SUSPEND 

msqHlqrateOb ject  -  unchecked iMi qraceObject<<of f setN 


;|  Execute  a  RestartContext  message.  I 

;l  Execute  a  Reply  message.  i 


RestartContext : 


Beply_2 : 


Reply_Restart : 


Reply 


Reply_Halt : 


MOVE 

[ Fa  St Con text Queue, AO ) ,  RO 

MOVE 

RO, (contextNext,  A1 1 

Put  the  fast  context  back  on  the  context 

queue . 

MOVE 

I01,RO 

MOVE 

RO, I  Fa stContextQueue,  AO  i 

MOVE 

[ replyio, A3 1 ,  RO 

Criticality  3. 

XLATE 

RO, objectXLATE.  Al 

xlate  the  reply  context  i.hto  Al  . 

BR 

“Reply  Restart 

MOVE 

Rl, [RO, A21 

Store  the  vaiue  replied. 

MOVE 

{ Fa StContextQueue, AO  I , RO 

MOVE 

RO,  ( context  Next , Al J 

Put  the  fast  context  bacn  on  the  context 

queue . 

MOVE 

ID1,R0 

MOVE 

RC,  i  Fa StContextQueue,  AO j 

XLATE 

R3.  ObjectXLATE.,  Ai 

Xlate  the  reply  context  into  Al . 

MO’VE 

FALSE,  RO 

Turn  off  A3  qwe..e  wraparo-nd. 

MOVE 

RO.  0 

MOVE 

( context  IDO, Al ) ,  RC 

Restore  the  aodress  and  ID  registers. 

XLATE 

RO, restoreXLATE,  AO 

MOVE 

I  context ID2, Al ) , RC 

XLATE 

RO, restoreXLATE.  A2 

MOVE 

( context ID3, AI ) ,  RO 

XLATE 

RC, restoreXlATE,  Al 

MOVE 

lcontextR3,Al ]  ,R3 

Rcsto.'c  the  oata  registers. 

MOVE 

[ context R2, AI ; , R2 

MOVE 

(contextR i , AI 1 , Rl 

MOVE 

1  cent  ext RO.  AI [ ,  RC 

MOVE 

; context  IP, Ai ; . IP 

Resume  cor.putat  ic:.  of  the  message. 

MOVE 

i repIylD, A3, ,  B3 

Criticality  3. 

XLATF. 

R3,  object  X  lATr.,  A? 

Xlate  the  reply  ;:c:.text  into  A?. 

MOVE 

!  repi ysiot ,  Al ; ,  RC 

WTAC 

RC, CFUT, RO 

MOVE 

!  contexcNex:  .A?  ,  RJ 

cnee*  w/:ctner  tr.i*  process  was  waiii.ng  .'or 

t  h  i  s  s 

EO 

Rl  . RO, R? 

MOVE 

;  repi  yva . ^e, Al , ,  Rl 

BT 

R?, "Reply  ? 

Suspend  if  r.ot . 

EO 

RC,  IRC,A?i,H;’ 

C'-e.'-*  the  pre-.;;?w;s  value  from  the  slot  ano  make 

HE 

R2  ,  ‘RepI  y  Ma  , 

;t  v».is  th(-  proper  cf..ture. 

MOVE. 

Ri  ,  IRC, A? 

Str,r*-  t-.r  -ete  va  1 -€»  there 

SUSPEND 

HALT 

ha  It  Reply 

msqRestartContexi  •  uncheexed  i  Rest  a  rtcor.t  ext  <<of !  set K 
msqReply  -  unchecKediRepiy<<oifsetK 
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I  Ex«cut«  *n  UpdateHome  mattage.  Update  the  BRAT  to  contain  the  object's  new  home  I 

1  location  and  send  an  Unlock  message  to  the  object  to  allow  it  to  move  again.  I 


UpdateHone: 

MOVE 

[updtHomeID,A3],Rl 

.‘Criticality  2. 

SENOO 

jupdtHomeNode,  AS] 

.'Send  an  Unlock  message  back  to  tne  object 

DC 

M5C  :msgUnlocic«2 

SENO2E0 

RO,  R1 

CALL 

looKupBinding 

.'Look  the  object  up  in 

the 

I  BRAT. 

BNIL 

RO, -UH  Halt 

.'The  BRAT  entry  should 

be 

present . 

CHECK 

RO,  INT,R3 

BF 

R3,  '“UH^Waiting 

MOVE 

[updtHomeNode, A3] ,  RO 

.'Change  the  BRAT  entry 

to 

reflect  the 

MOVE 

RO. [R2,A0} 

.-object's  new  location. 

SUSPEND 

UH_Waiting : 

CHECK 

RO, 1D,R3 

BF 

R3, -UH  Halt 

XLATE 

R0,objectXLATE,A2 

MOVE 

(contextNext.A2],R0 

BNIL 

RO, *UH  Halt 

.'The  BRAT  entry  should 

be 

present . 

CHECK 

RO, 1NTTR3 

BF 

R3, *UH  Waiting 

MOVE 

[updtHoneNode«  A3] , RO 

.'Change  the  brat  entry 

to 

reflect  tne 

MOVE 

RQ. [contextNext« A2] 

.'Object's  new  location. 

SUSPEND 

UH^Halt : 

HALT 

haltBRATHisslng 

msgUpdateHome  •  UpdateHome<<of f setN 


.'  I  Execute 
;  I  object 

an  Unlock 
was  marked 

message.  Unlock  the  object 
deleted,  dispose  it  now. 

to  al.ow  it  to  move 

again .  I f  the 

1 

Unlock: 

HOVE 

( unlocklO, A3 ] , R1 

.'Criticality  2. 

XLATE 

Rl,ObjectXLATE,A2 

.-Find  the  object  and 

clear  its  locked  flag. 

DC 

-hdrLocked 

MOVE 

[objectHeader , A2 ) , R2 

AND 

R2, RO, R2 

MOVE 

R2, lob jectHeader, A2 I 

ROT 

R2, -hdrDeietedN, R2 

;lf  the  Object  was  marked  deleted,  dispose  it 

BF 

R2, *Unlk  Done 

MOVE 

Rl.RO 

CALL 

di sposeObject 

Unlk^Oone: 

SUSPEND 

msgUnlock 

*  unchecked 

1 Unlock<<of  fsetN 

now. 
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Cosmos  Listing 


Cosmos.m 


•  «  *1 

#1  MeChod  Manager  ## 

I*  «* 


Return  the  ID  of  a  method  aaiociated  with  the  given  ciaae  and  aeiector.  The  second  I 
entry  point,  lookupMethodU,  can  be  used  when  the  class  has  already  been  type'Checkea  i 
and  coerced  to  be  an  integer.  I 


Call:  lookupMethod 
Call :  lookupHethodU 

In:  RO  Class  (Tagged  TAGO : subCLASS  if  lookupMethod  is  used,  INT  if  lookupHethodU  is  used). 

R2  Selector  (Tagged  TACO:subS£L). 

Out:  R2  ID  of  method  or  NIL  if  none. 

Criticality  1 . 

Alters  R0-R3/AID2. 


LookupMethod : 

CHECK 

R0,TAG0,R2 

.•Criticality  6. 

BF 

R2, *LM  Halt 

.‘Make  sure  that  RO  is  tagged  as  a  class. 

WTAG 

RO, int7ro 

ROT 

RO, -subtagN, R3 

AND 

R0,caClassM,R0 

.-Coerce  it  to  an  integer. 

EQUAL 

R3, subCLASS,  R3 

BF 

R3,  •'LM  Halt 

LookupHethodU : 

ROT 

RO, CsClassN. R2 

.-Criticality  6.  Shift  the  class  to  the  high  16  bits- 

WTAG 

Rl, INT, R3 

AND 

R3,csS€iectorM,R3 

OR 

R2, R3, R3 

.Maxt?  a  c.ass/scAector  pair  in  R3. 

WTAG 

R3,  CS,  R3 

PROBE 

R3,  R2 

.-Get  the  cached  ID  of  the  method  into  RO,  if  there  is  one. 

BNIL 

R2, “LM  SendMsq 

MOVE 

FIP, IP" 

LM_^Halt : 

HALT 

haitClassType 

LM^SendMsg : 

MOVE 

FIP,  R2 

.•Criticality  6. 

MOVE 

R2.F 

.-Criticality  3 . 

MOVE 

R2, I  context  IP, Ai 1 

.-Save  KIP  in  the  context. 

MOVE 

ft3,  I  contextRO,  Al  I 

.-Save  the  class/selector  pair  in  RO  of  the  saved  context. 

MOVE 

RO,  R3 

AND 

Rl. [NodeHasR, A0i,R2 

DC 

TAGO : aubCLASS<<subtagN 

OR 

R0,R3,R3 

.-Generate  the  class  ID  in  R3. 

DC 

MSG  msgApplyFunct  ion*  ^ 

SEND20 

R2,R0 

.-Sena  a  LookupMetnco  message  asking  to  lookup  the  method  and 

MOVE 

101, R2 

.-send  a  reply  back  tc  tne  context. 

DC 

LLookupMechod 

SeNC20 

RO.Rl 

SENO2E0 

R3,R2 

MOVE 

NIL.Rl 

MOVE 

Rl,  1  contextNext, Al } 

.-The  LookupMethod  handler  will  return  the  method  via  a 

DC 

SaveStateID023-C*2» 

.Met.hodRepiy,  which  will  reply  into  the  R2  slot  of  the  context 

BR 

RO 

;There  is  no  need  to  save  the  data  registers  in  the  context. 

fltLookupMethod  -  IP : abs i fault i unchecked i LookupMethod<<of t setN 
f  ItLookupMethodU  •  IP  :  abs  l  fault  i  unchecked  i  Look-jpMcthodU<<of  f  setN 


I  Execute  a  MethodReply  message. 


MethodReply:  MOVE  [met hodReply ID, A3 J , RO  ;Criticality  3. 

XLATE  RO, objectXLATE, A1  ;Xlate  the  reply  context  into  Al  . 

MOVE  [methodReplyValue, A3 ] , R?  ;Get  the  method  ID. 

MOVE  (contextRO, AJ 1 , RO  .Enter  it  into  the  cache. 

ENTER  R0,R2 

MOVE  FALSE, RO  .-Turn  off  A3  queue  wraparound. 

MOVE  R0,0 

MOVE  I  context  IDO,  Al  ),  RC  .-Restore  the  address  arc  ID  registers. 

XLATE  RO. rcstoreXLATE, AO 

MOVE  [context  I D2, a: ) , RO 

XLATE  RO,  restoreXlATF,  *■> 

MOVE  :contextID3,AM,R0 

XLATE  RC, restOreXLATE, A3 

MOVE  ;  context  IP,  a:  IP  .  Resuxc  cjfr.p..tdt  .o:-. . 

msgMethodRep, y  «  MethodRepiy<<offsetN 
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•  *  •! 

##  i;tUUi«s  l» 
•  «  f  I 


Dlvld*  R1  by  RO .  Return  the  quotient  end  remainder.  The  magnitude  of  tne  remainder  ( 
ie  aiweye  lest  than  the  magnitude  of  the  divisor,  ana  the  sign  of  the  remainder  it  i 
the  tame  at  the  sign  of  the  divisor.  Halt  if  the  divisor  it  zero.  I 


Call:  divide 

In:  RO  Divisor. 

R1  Dividend. 


Out:  RO 

R1 


Quotient . 
Remainder 


Criticality  1. 
Alters  RO/Rl. 


Divide: 

HOVE 

MOVE 

CHECK 

BF 

CHECK 

BF 

BZ 

BNZ 

MOVE 

BR 

Div_DividendNZ : 

LT 

EQUAL 

MOVE 

BF 

EQUAL 

BT 

ADO 

BF 

SUB 

SUB 

Div_Normal : 

LT 

BF 

NEQUAL 

BT 

MOVE 

8T 

MOVE 

ADD 

BR 

Oiv^DivlsorNeq: 

NEC 

NEC 

NOT 

Dlv^DlvliorPos: 

BF 

NEC 

Ol v^OivldendPos 

;  MOVE 

MOVE 

CT 

MOVE 

MOVE 

BT 

FFB 

MOVE 

FFB 

SUB 

LSH 

MOVE 

BR 

Div_l,oop'. 

SUB 

MOVE 

ADD 

LSH 

Div_Loop^l : 

LT 

BT 

SUB 

ADD 

Dlv_Loop_2; 

MOVE 

BNZ 

Div  Done3: 

MOVE 

BF 

NEC 

BZ 

SUB 

SUB 

Dlv_Done2 ; 

MOVE 

BF 

NEC 

Div_Donel ; 

MOVE 

BF 

MOVE 

SUB 

BF 

ADD 

Div_Done : 

MOVE 

MOVE 

MOVE 

Div  Zero: 

HALT 

Div~Overf low : 

HALT 

R2, lTempDiv_R2, AO) 

R3, fTempDiv_R3, AOJ 
RO, 1NT.R2 
R2,  *‘Oiv_NonInteger 
Rl, INT,R2 

R2, “Oiv^Nonlnteger 
RO,  ‘‘Div_Zero 
Rl, •Div_DiviaendN2 
0,R0 

'‘Div_Done 
RO,  0,R2 

Rl , $80000000. R3 
R3, fTempDlv_«0000000, AO) 
R3,  '^Div  Normal 
RO, -1,  r5 

R3,''0iv  Overflow 

ri,ro,rT 

R2,  ''Dlv  Normal 
Rl , RO, Rl 
R1,R0,R1 
R1,0,R3 

R2, “Div  DivisorPos 
RO. $80050000, R2 
R2,“Oiv  OivisorNeg 
0,R0 

R3, “Div  Donel 
-I,R0  * 

Rl, $80000000, Rl 

''Div_Donel 

RO,  RO 

R1,R1 

R3,R3 

RS.^Div  OlvidendPos 
R1,R1  “ 

R2, FOPO 
R3,FOPl 
RO,  Rl. R2 
RO,  R3 
0,R0 

R2,  '■Div_Done3 
R1,R2 

R2, [TempDiv_Count, AO] 

R3,  R2 

R2, tlempDiv  Count, AO], R? 
R3,R2,R3 

R2,  'TempOl v_Count ,  AD i 
'*Div_Loop  1 

R2,  I,R2 

R2,  ; TempOiv_Count,  AO j 

RO, RO, RO 

R3,-1,R3 

Rl , R3, R2 

R2, “Div  LOOp_2 

ri,r3,rT 

RO, 1, RO 

( TempDi v_Count , AO  J , H2 
R2,  ‘*Div_Loop 

F0P1,R2 

R2, “Div_Done2 

RO,  RO 

Rl , “Div  Done2 
RC, 1 , R0“ 

R3,Ri,Ri 

FOPC, R2 

R2, "Div  Done! 

Rl ,  Rl 

(TempDiv  BOCOOOOO, AO ; , R3 
R3, 'DivSone 
FOPO, R2' 

RC. 1 , RC 

R2 , “DivDone 

RC. 2, RC 

[TempDiv  R2,AOi,R? 
[TempDiv'R3, AO : , R3 
f:p.  ip 

haitDivC 

haltOverflow 


:Criticality  8.  Save  R2  and  R3. 
.'Check  for  futures  and  bad  types. 


.'Halt  if  the  divisor  is  zero. 

;If  the  Dividend  was  zero,  return  a  zero  quotient  and 
: remainder . 

;R2  is  true  if  the  divisor  is  negative. 

.'TempOiv_8COCOOCO  is  true  if  the  dividend  was  $80000000. 


;In  this  case,  when  tne  divisor  is  -1,  the  division  overflows. 
.The  dividend  is  $80000000. 

;Wnen  the  divisor  is  positive,  add  it  to  the  dividend; 

.‘When  the  divisor  is  negative,  subtract  it  from  tne  dividend. 
;?he  reverse  adjustment  w;II  bo  made  on  the  quotient  later. 

;R3  is  true  if  the  dividend  is  negative. 

;7ne  divisor  is  -$80000000. 

.When  tne  dividend  is  positive,  the  quotient  is  -1. 

.-When  the  dividend  is  negative,  the  quotient  is  0. 


;If  the  divisor  was  negative,  negate  both  it  and  the  dividend. 


;If  the  dividend  is  now  negative,  neoate  only  it. 

.'Now  both  the  divisor  ano  the  dividend  should  be  positive 
; (and  no  greater  than  $?FFFFFFF»  . 

.'Move  the  divisor  to  R3. 

;From  now  on  RO  contains  the  quotient. 


; R2  contains  the  number  of  extra  bits  of  magnitude  in  the 
;dividena  over  the  divisor. 

;Shift  the  divisor  so  that  its  most  significant  bit  is  in  the 
;same  position  as  the  dividend's. 


.Shift  the  quotient  to  the  left  and  the  divisor  to  the  right. 
;Try  s'ubtracting  the  shifted  divisor  from  the  dividend. 

; I f  successful,  increment  the  quotient. 


the  divide."?;  was  negative,  negate  the  quotient; 
if  the  remainder  was  positive,  subtract  the  remainder 
from  the  div.sor  and  S'UDtract  one  from  the  quotient  to  keep  the 
remainder  positive . 


; J f  the  divisor  was  negative,  negate  the  remainoer. 


;Tne  dividend  was  SfiC'COCCJC.  Perform  the  quotient  adjustment. 
;Wher.  tne  divisor  was  positive,  subtract  1  from  the  quotient; 
.•When  the  aiv.scr  was  negative,  add  1  to  t.he  quotient. 

•  Restore  regis'^ers  anc  rei..rn- 


Div_Nonrnteqer  T  CHECK  R0.FC'T,R2 


eit.ner  opcra.'d  was  a  fut  .re,  cras.h  with  the  future  fault; 
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Div  Future: 


BT  R2, “Div  Future 

CHECK  R1,FUT,R^ 

BT  R2, -Div_Future 

HALT  HeitType 

HALT  haXtFuture 


.'Otnerwise,  crash  with  the  type  fault. 


fltDivlde  -  IP : abs I  fault  I  unchecked  I Divide<<offaetN 
fltCrashOverf low  -  IP : abs I  fault  I  unchecked  I Div_Overflow<<off$etN 


I  Allocate  and  initialize  a  new  closure. 

I 

(Call:  newClosure 

I 

I  In:  RO  First  word  of  object. 

I 

lOut:  AID2  '•Object, 

t  R1  Object's  ID. 

I 

ICrltlcallty  3. 

I 

■Alters  R0'R3/AID2. 


NewClosure : 


MOVE 

FIP, Rl 

;Crit;cdlity 

6. 

MOVE 

Rl,F 

.•Criticality 

3. 

MOVE 

Rl,  lTempNCl_^FlP,AOJ 

CALL 

allocNextOb3ect 

HOVE 

R0,R1 

DC 

(111  000100000100000 1  (callClosure&%I00000)«4icallClosure&l0nill)«17 

WTAG 

RO, 1NST3, RO 

.-Install  the 

faul ter  instruction . 

MOVE 

RO, 1 oClosureCode. A21 

MOVE 

{TempNCJ_FlP,A0J, IP 

f ItNewClosure 


IP:abs|faulti uncheckea i NewClosure<<of  f  setN 


.Criticality  S. 
.•Copy  AO  to  A2. 


.'Mask  the  length  of  the  object  pointed  by  A3 
:to  hArgs  words. 


.-Put  the  numoer  of  display  arguments  in  R3 . 
.-update  the  length  of  the  message. 

.-Send  the  message  back  tc  this  node. 

.-Send  the  real  function  . 
osure_2<<of i setK 
.-Override  the  Limit  fault. 

.-Send  the  rest  of  tne  arguments. 


I  Call  the  function  in  a  closure.  This  routine  does  not  return. 


ICall:  callClosure 

I 

ICriticality  0. 

! 

CallClosure:  MOVE 

MOVE 
DC 

MOVE 

OB 

MOVE 

OR 

XOR 

OB 

MOVE 

SUB 

AND 

SUB 

ADD 

MOVE 

SEND20 

SENDC 

DC 

MOVE 

SENOO 

SENOO 

SENOO 

SENDO 

SENDC 

SENDO 

SENDO 

SENDO 

SENDO 

SENOO 

SENOO 

SENDO 

SENDO 

SENDO 

MOVE 

CCl_SendRest :  SENDO 

ADD 
SENDO 
ADD 
SENDC 
ADD 
SENDO 
ADD 
BR 

CallClosure  2:  AND 
MOVE 
MOVE 
MOVE 
CALL 
SENDEO 
CALL 


AO,  R3 
R3,  A2 

MSG : msgApplyFunct ion 
(oFunctionNArqs, A2 ) , R1 
R0,R1,R0 
A3,  R2 

R2,  lengthM,R2 
R2,  lengthM, ft? 

R2,  Rl, R2 

R2,  A3 

ft3, 1, R2 

R3,  lengthM,R3 

R3,oClosurcDisplay,R3 

RO,  B3,  RO 

NNft,  Rl 

Rl,  RO 

! oClosureFunct , A2 1 
IP : abs I  unchecked  >  Cal iCi 
RO,  I LimitOver r ide,  AO ; 

12,  A3] 

{3, A3; 

14.A31 

13. A31 
[6.  A3] 
f'^,A3) 

(«, A3) 

(9.  A3! 

110,A31 
I  11, A3) 

[12, A3) 

[13, A3) 

!M,A31 
[15, A3! 

16, BO 
[RO,  A.1  I 
RO. 1 , RC 
1R0,A3: 

RO, I , BO 
|R0, A31 
RO, 1 , RO 
tRC, A31 
RC,  1 ,  RO 

“CCl  SendRest 
R2,  IcnqthM, R3 
[R3,A2),R3 
R2,  A2 
2,  RO 

biockSend 


.-Senas  more  argument 


.-Get  t.hc  .ast  cisp..ay  argument. 

.- Decreme.nt  tr:e  .onqtfi  of  the  display  by  one. 
.-Ser-.G  tne  display  .irq^ments. 


suspend 


fit CallClosure  -  IP:aDSj  unchecked i 


1  osur «»< < o f  {  sctN 
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«#  M 

44  Control  HanAgcr  44 

t4  44 

44444444444444444444444 


I  Execut*  An  Apply,  ApplyPunction,  or  ApplySclector  mosxAqe. 


Apply : 


Appl  y  Fund  i  on : 


ApplySoIcdor : 


AS 


1 : 


MOVE 

[ Apply Funct, A3] ,  R1 

;Ci\tlcAllty  0.  Get  the  funct. 

CHECK 

R1,TAC0,R2 

.’If  it  Has  tAq  0,  Assume  it  is  a  selector. 

BT 

R2.  '‘ApplySelector 

CHECK 

Rl.  I0,R2 

;If  it  hAS  tAq  10,  Assume  it  is  a  function. 

BT 

R2, •ApplyFunction 

HALT 

Ha It Apply 

.‘Otherwise  the  messAqe  was  InvAlid. 

MOVE 

i ApplyFunct, A31 ,  RO 

.‘CriticAllty  0.  Get  the  function. 

XLATE 

RO.objectXLATE.AO 

DC 

IP :oFunct ionCode<<of f setN  ;StArt  axecutinq  At  the  second  word  of  the  function 

MOVE 

RO,  IP 

MOVE 

(Applyfteceiver,  A3 ] ,  RO 

.-CriticAlity  0.  Get  the  receiver. 

PROBE 

RO.  R1 

.’Probe  it,  hoping  it  is  An  ID  or  DID. 

BHIL 

Rl, “AS  Miss 

MOVE 

Rl.  A2 

:If  so,  point  A2  to  the  instance  object. 

MOVE 

RO. ID2 

MOVE 

(objectKeAOer. A2] .  RO 

:£xtrACt  the  class  from  the  object  header. 

WTAG 

RO, INT.RO 

ROT 

R0,-hdrClAssN,R0 

AND 

RO, hdrClAssH,  RO 

MOVE 

I ApplyFunct,  A3 ] ,  Rl 

.’Get  the  selector. 

CALL 

lookupHethodU 

;RC  now  contains  INTrc.ass. 

DC 

IP  :  oFunct ionCode<<of  t 

setK  ;Co  execute  the  methoo. 

XLATE 

R2, objectXLATE,  AO 

MOVE 

RC,  IP 

CALL 

typeOf 

.’Call  tne  rea-  ciass-extractior.  routine. 

BR 

“AS  : 

josgApply  ■  Apply<<o 
msqApplyFunctlon  - 
msgApplySeloctor  - 


f  fsetN 

ApplyFunction<<of IsetN 
ApplySelector<<of  f  setN 
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It  Initialization 


I  Initialize  the  MDP . 


InitializeMDP:  DC 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

HOVE 

MOVE 

MOVE 

HOVE 

MOVE 

MOVE 

MOVE 

MOVE 

MOVE 

WTAG 

DC 

2MDP_ClrGioba:a;  SUB 
MOVE 
3NZ 


AOOB : Inva i ia 

BO,  AO 

BO,  AI 

BO.  A2 

BO,  A3 

BO. AOB 

BO. AIB 

BO.  A2B 

BO. A3B 

BO. AOB 

BO. AIB 

BO. A2B 

BO. A3B' 

NIL.  BO 
BO. IDO 
BO. IDl 
BO. ID2 
BO, 1D3 
BO. IDOB 
BO. xDlB 
BO. ID2B 
RO. 1D3B 
RO, lOOB 
RO, IDIB 
RO, IU2B 
BO. ID3B 
-l.Ri 

ri.cfut.b: 

64 

BO.  3.R0 
Rl, [BQ.AC; 

RO, •IMDP  ClrClobals 


.-Clear  cne  user  aaoijss  and  ID  regiatera. 


; Clear  ai*  qiooals  to  CFUT:-l. 
;R]  contains  CFUT;-l. 


DC  AODR ; Oueuel Start <<DaseK  ::nitia.ire  the  queues 

MOVE  BC.OHL 

DC  ADDR:OjeuelStart<<fcascN.  i Queue lEna-Queue St art-i  > 

MOVE  RQ.OBM 

DC  AODR :QueueOStart <<oa8eN 

MOVE  RO.QHL 

DC  AODR:OueueCStart<<Dases  !  tOueueCEnd-Q-ue-ueCSt art- 1 ) 

MOVE  RO.OBM 


IMDP  ClrXlate: 


IMDP  ClrBrat : 


IMDP_ClrHeap: 


IMDP  MaKeFaat : 


MOVE 

DC 

MOVE 

MOVE 

DC 

MOVE 

MOVE 

DC 

SDB 

MOVE 

BNZ 

MOVE 

DC 

MOVE 

DC 

SUB 

MOVE 

BNZ 

DC 

MOVE 

MOVE 

MOVE 

DC 

MOVE 


NIt,R2  ;R2  contains  NIL. 

ADDR ; XlateStart<<oasoN i (XlateEnd-XlateStar t-l > 

R2, { Liml tOver r ioe, AO  ; Initialize  Limi tOverride . 

RO.TEIH  .-Initialize  tne  xlate  table. 

ADDR ;XlateScart<<baseN 

R2,  [  FastContextOueue.  AO  ;  .-Initialize  FaatContextOueue. 
RO,  A2 


IF  ‘FASTS  I M 


XlateEna-XlateStart 
BO, 1 , BO 
R2,  fSO,  A2! 

RC. 'IMDP  ClrXlate 
R2,  [BRATFree.AO] 

AODR : BRATStart<<baseN 
RO,  A2 

BRATEhd-BRATSt art 
RC, 1 . RO 
R2.  IRO, AZ; 

RC,  ' lMDP_ClrBrat 
FixediieapStart 
RO,  [ Heapstart , AO . 

RO,  [FlrstFree, AO) 

RO,  R3 
MeapEnd 

RO.  I LastFree. AC  I 


MOVE 

ADD 

GE 

8F 

END 

MOVE 

MOVE 

DC 

MOVE 

AND 

ROT 

AND 

ROT 

O.B 

ROT 


OR 

MOVE 

DC 

OH 

MOVE 

MOVE 

MOVE 

MOVE 

SUB 

MOVE 

ROT 

MOVE. 

OR 


Rl, (R3,A0; 

R3, ;,R3 
R3, RO, R2 

R2, “IMDP  Clrheap 


.'Clear  every  entry  in  tne  table  to  NIL. 
; Initialize  BRATFree . 

;Clear  the  BRAT. 

.'Initialize  tne  neap. 

•Clear  tne  heap  to  cruT:-l. 


: :  n  i  t  i  a  . 


Ire  Rancorr.Seec,  ana  SeriaiNode. 


Ca-cr.,;atc  this  node’s  serial  number  from  tne  NNR  value. 


NNR, R2 

R2,  I RanaomSecd, AL 
nodeMa  s  ^ 

RO, (NodeMasN, A3 
R2, xM. R3 
R?, -yN, «G 
RC, yM, RC 
RC,  xL,  RC 
H3,  RC,  R3 
R?, -7N, RO 
RO, zM, RC 
RO,  xl  ♦yL,  RC 
R3,  RO,  R3 

R3,  i Se r i a  1  Node, AC 

ID;  <  rFa  St  Coni  ext -  1 )  <<  scr  1  a  1  N 

R0,R2,BC'  ;  I  n  i  I  i  .  i /e  l.a  st  Ob  feet  I D  and  Next  D  i  St  ob  f  I D 

RO,  [La  stObfec-t  I  L,  At  ' 

0,  RC 

RO,  [  Next Di St  Ob ) I D,  A 
nF  a  stContex t  s ,  R3 
R3, 1 , R3 

R3,  [TempINITM  Co’'iext,AC 
R3,  ser  ia  IN,  R'. 

NNR,R3  .-Put  tr.e  r.ose  number  :.-to  the  context  ID. 

Rl , R3, R  I 


;K<iKe  r-.E  a  St  fen-^ext  s  fast  contexts. 

.Save  the  number  of  fast  contents  yet  to  be  made. 


Cosmos.m 
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WTAC  r: .  ID. r: 

DC  OBJ  :  nor Locked  I  horFa  ftt  >  corrteMis: 

CALL  ellocODject 

XOR  RQ. rei.RO  ;Heke  the  test  context  ADLRs 

ENTER  Rl.RO 

CALL  enter&indinq 

MOVE  ( Feat Con text Queue.  AO J .  RO 

MOVE  RC. icontextNext, A2 i 

MOVE  1D2.ro 

HOVE  RO. ( FastContextOueue, AC ; 

MOVE  !Temp:NITM_Context,A01,R3 

BNZ  R3.  ‘'IMDP  MelceFast 

MOVE  RO.IDIB  .-initialize  priority  0‘a  AID 

MOVE  A2,R1 

MOVE  Rl, AlB 

HOVE  IcontextNext. A2; .RO 

HOVE  RC. IFa stContextQueue, AO ; 

MOVE  [Fir atFree. AO J . RO 

MOVE  RO. [HeapStart. AO] 

MOVE  FALSE, RO 

MOVE  RO. 1 

IHDP_Backgroun(l : 

IF  ; REALMODE 
STOP 
END 

BR  “IMDPBacltground 

OSEnd: 

FixedHeapStart : 


.-The  real  neap  atarca  after 
.-Enable  message  reception. 

;Do  nothing  in  the  oacxgroun 


ORC 

iPriority  0  faults: 


FaultsOStart 


.-Priority  1  faults 


.-System  calls: 


DC 

f ItCrashO 

CATASTROPHE 

DC 

f ItCrashO 

INTERR'JPT 

DC 

f ItCrashO 

QUEUE 

DC 

f 1 tSend 

SEND 

DC 

f ItCrashO 

I LC INST 

DC 

f ItCrashO 

DKAMF.RR 

DC 

fitIKVADR 

IKVADH 

DC 

f ItCrashType 

ADRTYPF 

DC 

fltLlmit 

LIMIT 

DC 

fltEarly 

EARLY 

DC 

f ItCrashO 

MSG 

DC 

fitXLATE 

XLATE 

DC 

fltCrashDverf low 

CVEHf  LO« 

DC 

f  ItCFL’T 

CFUT 

DC 

tl tCrashFuture 

FC? 

DC 

fltCrashType 

TAC8  -  ID 

DC 

f ItCrashType 

TAC^  -  DID 

DC 

fltCrashType 

TACA 

DC 

fltCrashType 

TAGS  -  FLOA 

DC 

fltCrashType 

TYPE 

DC 

fltCrasnC 

$14 

DC 

fltCrashC 

Sib 

DC 

fltCrashC 

$16 

DC 

f  ItCrashO 

SI*? 

DC 

f  ItCrashO 

$18 

DC 

f ItCrasnO 

$1 

DC 

fltCrashC 

$;a 

DC 

fltCrashO 

$1B 

DC 

fltCrashC 

$1C 

DC 

fltCrashO 

$1D 

DC 

f ItCrashO 

Sit 

DC 
ts  : 

fltCrashO 

SIF 

DC 

fltCrashl 

catastrop.se 

DC 

f ItCrashl 

INTERRUPT 

DC 

f  llCrasr.I 

QUEUE 

DC 

f ItCrashl 

SEND 

DC 

fltCrashl 

ILGIKST 

DC 

fltCrashl 

DRAMERR 

DC 

fltCrashl 

INVADR 

DC 

fltCrashl 

ADRTYPl 

DC 

fltCrashl 

LIMIT 

DC 

fltCrashl 

EARLY 

DC 

f ItCrashl 

MSG 

DC 

fltCrashl 

XLATE 

DC 

fltCrashl 

OVERFLOW 

DC 

fltCrashl 

CrjT 

DC 

fltCrashl 

PUT 

DC 

f ItCrash 1 

TAC8  -  ID 

DC 

fltCrashl 

TAGS  -  DID 

DC 

fltCrashl 

TAGA 

DC 

fltCrashl 

TAGB  -  KLOA 

DC 

fltCrashl 

TYPE 

DC 

fltCrashl 

$14 

DC 

fltCrashl 

Sib 

DC 

f ItCrasnl 

$16 

DC 

fltCrashl 

SIT 

DC 

fltCrashl 

518 

DC 

fltCrashl 

519 

DC 

f  1 1 C  r  a  s  h 

SI  A 

DC 

f ItCrashl 

SIB 

DC 

f  i  t  C  r  a  8  h  1 

Sic 

DC 

f . t Cra  sh  1 

SID 

DC 

fltCrashl 

SIF 

DC 

f .  t  C  r  a  s  h. 

SIf 

DC 

f i t  Swspend 

St: 

DC 

f ItBIocxMove 

su . 

DC 

llBIocnSrr.d 

St? 

DC 

f ItCompactMeap 

SI.  3 

DC 

f I t a: . jcOb^ect 

SS4 

DC 

f  1 1  Ent  erB  t  r.ding 

s:  b 

DC 

f 1 1  LookupB: ndi nq 

SIH 

DC 

f 1 tDe let eBi ndinq 

sc 

DC 

fit  PurqeB: nd i nq 

SL8 

nonrelocatable . 


to  a  fast  context . 


tne  fast  contexts. 


a  mode. 
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DC 

i  i  iNewLoca 

$C9 

DC 

f 1 LAiiocNext Object 

$0A 

DC 

f ItAlIocNewOb ject 

50B 

DC 

fitNewContext 

$0C 

DC 

fltDisposeContext 

500 

DC 

f itOi sposeObject 

$0E 

DC 

f it  Deal! oca teOb ject 

SOF 

DC 

f itNewOb ject 

$10 

DC 

f ItCIassOf 

511 

DC 

fltTypeOf 

51? 

DC 

fltObjectNode 

513 

DC 

f itPreferredConst Ituent 

$14 

DC 

f  ItCo 

$1$ 

DC 

f It LooKupHethod 

$16 

DC 

f 1 tLookupHethodU 

$17 

DC 

f ItDivide 

$15 

DC 

f ItNewCiosure 

$19 

DC 

fit  CallClosure 

$1A 

DC 

fltCrashCall 

SIB 

DC 

f ItCrashCa 1  i 

SIC 

DC 

f ItCrashCal 1 

SID 

DC 

fltCrashCail 

SIE 

DC 

f itCrashCall 

$1F 

DC 

1 1 tCrashCa  1 1 

$2C 

DC 

fltCrashCall 

$21 

DC 

f ItCrashCa  1 1 

$22 

DC 

f ItCrashCall 

$23 

DC 

f 1 tCrashCa  1 1 

$24 

DC 

f i tCrashCa  1 « 

$2$ 

DC 

f ItCrashCall 

$26 

DC 

f ItCrashCall 

$27 

DC 

f ItCrashCall 

$28 

DC 

f 1 tCrashCal  1 

529 

DC 

f 1 tCrashCa  l  - 

:2A 

DC 

fltCrasnCa*. 

S2B 

DC 

f  *  t Cra  shCa  i 

S2C 

DC 

!  ItCrashCa  '. 

S2D 

DC 

f  1 1  Cr  a  sr.Ca  1 . 

S2E 

DC 

f ItCra  shCa  .  1 

S2F 

DC 

f ItCrashCa-  . 

530 

DC 

i ItCrashCa - i 

S3: 

DC 

! ItCrashCa.  . 

532 

DC 

fltCrasnCa.. 

$33 

DC 

f ItCrashCa  1  : 

$34 

DC 

f  1 1 Crash.Ca .  : 

S3b 

DC 

f  ItCrashCa . 

536 

DC 

f i tCra  ShCa : : 

S3  • 

DC 

f . tCra  ShCa  .  . 

538 

DC 

f  ItCrashCa  . 

339 

DC 

f it Cra  ShCa  . 

S3A 

DC 

I ItCrashCal . 

53H 

DC 

f-tCrashCa  '.  I 

53C 

DC 

f ItCrashCa  ii 

S3C 

DC 

i . tCrashCa  l 

531 

DC 

f .tCrashCa  . 

S3F 

END 


BREAK  HAZARDS 
IF  :realmcce 

BREAK  FAOLT  Fat- 1 1  sOSia  r t ,  Fau U  s  *.  S'.a ri 
BREAK  READ  WRITE  OSS t a  rt.  .  ,  OSErtO- *. 

BREAK  FETCH  0. .OSStart 
IGNORE  FETCH  5400, 54CI 

BREAK  READ  WRITE  Fault s CSt  j r;  .. Ca 1  sEno- 
STEP  300 

BREAK  FETCH  $400. $401 
BREAK  pS'*'',  WRITF  C  ,  ,  3 
END 


:HredK  cr  nazaros. 

Brean  or.  catastropric  faults. 

Protect  operating  system  cooe. 

GloPals  cannot  ce  executed. 

Tnc  in  it  :a  i  i  zat  ;cr.  cooe,  however,  can. 

Fau.t  vectors  are  protected. 

AliCw  're  operating  system,  tc  write  glcoals. 

Tr.e  ir.i  t  iai  i  zat  ior  ccoe  is  now  gone. 

Locat.cr.s  0  thrc-gr  3  are  not  used  for  anything. 


INCLUDE  “Punt  ime  .  T'* 


;  Lcaa  the  r.:n-t  i-o  syste.m. 


IF  ;real.hodf 

RUN 

END 


. :  re  f'e  operating  syste.r. 
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Runtime.m 


MOP  Operating  Systeff, 

version  2.3 

written  by 

Vialdemar  Horwat 

Master*!  thesis  under  Prof.  Wiillare  Dally 

March  2$,  1989 

May  1991 

Send  problems  and  comments  to 
waidemarfhx. Ics.mit .edu. 

Copyright  1989,  1990,  1991  Waldemar  Horwat 

:The  download  header  is  appended  to  the  peqlnninq  of  every  module  that  Is  downloaded. 


HODUL£ 

OOWNLOADHEADER 

DC 

MSC : msqAcceptOb ject 1 *2 

DC 

lONODE 

END 

MODULE 

LooKupMethod 

vCuzrentClass 

-  € 

;Class  number  of  superc.ass  cuxror.tly  scann<*c. 

BeglnO : 

DC 

OBJ  :  hdrCopyable  ;  class.^'unct  i«.n<<ndrClassN  .•F.noC-Beqir.C 

DC 

1 LookupMethod  t 

DC 

5 

MOVE 

( lookMetnSe lector ,  A3 ) ,  RC 

A  LATE 

RO,  objectXLATE. A2  .Point  a:o2  to  tnc  selector  object. 

MOVE 

( lookMetn.Ciass,  A3  1 ,  R2  .  Store  tne  class  in  R3. 

MOVE 

1 lookMethRepiy I D, A3 ! , RO  .  Save  tne  reply  ID  in  tne  context - 

MOVE 

RO, [lookMetnReplylD. A1 ! 

MOVE 

FALSE, RC  .Turn  off  A3  queue  wrapa  ro.r.a  . 

MOVE 

RO.Q 

MOVE 

RO, [ vCur rentcl ass, Al ; 

MOVE 

(oSelKMetnods,A2!.R2 

MOVE 

oSelHetnods,  ri 

82 

R2, “Miss: 

searchl : 

EO 

R3,  (Rl ,  A2  1 ,  RO  .-SoarcJi  tne  c.ass/metnoo  associations  lor  the 

BT 

RO, “FoundHetnod  :ciass  in  R3. 

ADD 

Rl , 2.R1 

SUB 

R2, 1,R2 

Bnz 

R2, “Searchl 

Missl : 

XLATE 

R3,  Ob  jectXLATE,  A3  ;If  no  association  was  founo,  scar,  tne  class's 

MOVE 

0,R0  ; superclasses . 

Hiss2: 

ADD 

RO, 1,R0 

C£ 

RO,  [oClassNAllSuper s,  A3 ; ,  Rl 

BT 

Rl,“MlssAli  .-Return  Nil.  if  an  associat  ior-.  still  wasn't  found 

MOVE 

RO, (vCurrentCiass, A1 ! 

ADD 

RO,oClassAl isuper  s,  RC 

MOVE 

fRO, A3) , R3 

MOVE 

(oSelNMethods, A2) , R2 

MOVE 

oSelMetnods, Rl 

Search2 : 

EO 

R3,  I  Rl ,  A2  1 ,  RO  .-Search  the  class/methoc  asscciaticns  for  tfte 

BT 

RO, “FoundMetnod  ;ciass  ir  R3. 

ADD 

R1,2,R1 

SUB 

R2, 1,R2 

BNZ 

R2, “Search2 

MOVE 

(vCurrentClass,Alj,HO 

BR 

“Mi ss2 

MissAll : 

MOVE 

NIL,R2  : No  method  was  lounc,  so  ret-rn  NIL. 

BR 

“FoundMcthod? 

FoundHethod : 

ADD 

R1,1,R1  .-Extract  tne  iretnoc  ID 

MOVE 

|R:,A2i,R2 

FoundMethod2 ; 

MOVE 

1 lookMethReplylD, a: , , Rl 

DC 

MSC  ;  msqwet  hodHep  1  y  ♦  3  .-Returr;  a  reply  <^«*Siiaqe  wxtr  the  method  ID. 

SEND20 

Rl .  PO 

SEND2E0 

SUSPEND 

R1,R2 

FndO : 

END 

.-NewDistobj  messaqe: 
LABEL  newDistobjClass  • 

2 

LABEL  newOi Btobjsi /e  - 

3 

LABEL  newDistob  )Ret  ..r  r, ' 

D  -  < 

LABF.L  newDl  St  Ob  jRet r  nS 

let  -  b 

LABEL  newDlstoB jTerp  - 

6  . Tempera  r  y 

M'JD'JIF. 

f  New  0 : si oL  ' 

Beqih;  DC 

OBJ  ;  hdr  repy-t  u.  e  C.assr 

'.  ,  1*  .  <•  ar 

or 

1  f  New  D  i  St  ot;  ‘  > 

DC 

6 

MOVE 

NNP, R  1 

EQUAL 

R  1 , 0 .  R  1 

BT 

R 1 , 'OhNodeC 

DC 

MSG  ;  msqAppl  yf  .inrit  ;on  6 

SE.NL20 

0,  RC! 

If  h'-*.  .  I 

or  *.a  r  a  t 

T; .  s  ressjqe  tc  nooe  C 

DC 

{  f  New  Cistob; 1 

SENOC 

RC 
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Cosmos  Listing 


Rnr.time.in 


OnNodeO : 


PositiveSize : 


End: 


SENDO  I newDl stob jClass, A3] 

SENOO  I newDi stobjSize^ A3J 

SENDO  [ncwDistobjReturnlD, A3] 

SENDEO  [nffwDlalobjReturnSiot, A3) 

SUSPEND 

MOVE  InewDiatobjSize. a31.ro  :Put  nax(size.l)  into  RO. 

CT  RO.O.Rl 

BT  Rl. “PoiitiveSize 

MOVE  1,R0 

SUB  RO.l.RO  .'Calculate  Iq  (nax  (size,  1 ) )  and  store  it  in  RO. 

FFB  RO.Rl 

MOVE  31, RO 

SUB  RO.Rl, RO 

ADD  RO - ‘LoqNNooes, R1  ;R1  contains  -stride. 

NEC  R1,R3  :R3  contains  stride. 

MOVE  -1,R2 

ASH  R2,R3.R2 

NOT  R2,R2 

DC  AD0R:64  ;Point  A2  to  the  global  area. 

MOVE  R0,A2  ^Criticality  2. 

MOVE  (RandomSeed.A2).R3 

ADD  R3,3,R3 

MOVE  R3, (RandomSeed, A2] 

AND  R2,R3,R2  ;R2  contains  the  offset. 

MOVE  {NextDistobjID,A2),R3  :Get  the  ID  for  this  distributed  object. 

MOVE  -1,R0  .-Advance  the  ID  counter  by  the  number  of  constituents 

ASH  RQ.Rl.RO  .'per  node. 

NOT  RO.RO 

ADO  RO.  l.RO 

ADD  R3,R0.R3 

MOVE  R3. [NextDistobjlD, A21 

SUB  R3,R0,R3  .Criticality  1. 

ROT  R3. serialN. R3 

OR  R3.  di stob  jMember ,  R3  .-Calculate  the  DID  for  this  distributed  object  and  store  it 

OR  R3, R2. R3  ;in  H3. 

NEC  R1.R2 

ASH  R2, loqStrideN.  R2 

AND  R2, ( l<<loqSt r iaeN*loqSt rideLi -1 . R2 

OR  R3,R2,R3 

WTAG  R3,DID.R3 

MOVE  UogNNodes, R2 

ADD  R2,R1,R?  ;Fut  iq (max ( si ze, 1 ) »  into  newDistobjTemp. 

MOVE  R2. [newDi stobjTemp. A1 ; 

MOVE  C,R0 

MOVE  R3,R1  .'Send  a  newDi stoc jTree  message  to  the  node  that  will  contain  the 

CALL  CO  .first  constituent  of  the  distributed  object . 

DC  MSG : msqApplyFunct ion  I ^ 

SEND20  Rl.RO 

DC  { fNewDistobjTree) 

SENOO  RO 

SENOO  ;newOistobjc:a$s.A3] 

SENC20  inewDi stob jS: re, A3 | , R3 

SENDO  0 

SENDO  inewOi StobjTemp. A1 1 

SENOO  1  newCi stob jReturnlO. A3 i 

SENDEO  (newDisto&jRet uf nSlot,  A3 ; 

SUSPEND 

END 


REF  REV  f_New_DistObj  -  ID  :  ( 1  <<30  i  (  UmXK<sX  l  (  Uff-Y  K<sy  1  (  :  4m7.  K<sZ  :  (  Uff‘S)<<sS  ) 


.’NewDistobjTree  message: 

LABEL  newDistobjTreeclass  •  2 
LABEL  nevOistobjTreesire  -  3 
LABEL  newOistobjTreelD  -  4 
LABEL  newDistobjTreeStart  -  S 
LABEL  newDistobjTreeLoqDelta  -  6 
LABEL  newDlstobjTreeReturnID  •  T 
LABEL  newDistobjTreeKeturnSlot  -  8 

MODULE  f NewDi stob jTree 

Beqin2  :  DC  OBU  :hdrCopyabie'classf’  unct  ion<<r:drClas5N  :  Lnd2-Beqin2 

DC  I f NewDi stob jTree ) 

DC  9 

MOVE  I  newDi  stob  jTree  LoqDel  t  a.  A3  ,  H  j 

BZ  R3, "Leaf 

SUB  R3. 1 . P3 

MOVE  NNR.RI 

DC  MSG : msqApplyf  unct ion . 9 

SEND20  Rl.RO  ;Caii  f NewD ; s tob ITr ee  twice,  each  time  on  half  of  the  range 

DC  i f NewDi stob jTree f  ;of  constituents. 

SENDO  RO 

SF.NDO  I  newDi  stocjTreeCiass,A3: 

SENDO  inewDistobjTreeSize, A3 i 

SENDC  inewDi stobjTreelD, A3; 

SEND2C  ;  newDl stobjTreeStart.A3;,R3 
MOVE  9,R0 

SEND2F.0  IcontextID,  a:  :  ,  RC 

WTAG  CKUT.  RC  .Maxe  a  cf..ture  in  RC  . 

MOVE  BO,  I  9.  A1  j 


MOVE 

: newDl 8tobjTrefID,Ai;,Rl 

MOVE 

‘ newDi St  Ob jTr restart , A3 ; 

MOVE 

:,R2 

ASH 

R2, R3, R2 

ADD 

HO, RP, RC 

MOVE 

RO, ; : 1 , a: : 

CALI 

CO 

DC 

MSG:msqApply?-  ..nrt  ion  9 

SF.n:;7& 

M  :  ,  RC 

DC 

'  f  New!. .  St  ..o  It 

SENDC 

HC 

SENDC 

1  newD  i  St  ot  jT  r  eeC  1  a  5S ,  A.! 

SEN.DC 

,  r.ewD  i  St  Ob  ;T  reeS  1  /r,  A  j  ' 

SENDC 

;  newDi St cb ;T  tee  ID,  A3 

SENDPC 

,  1 1 , Ai i , R3 

MOV! 

1  C,  BO 

SEND7EC 

;  cent  ext  1  D,  A I  ; ,  B(. 
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WTAG  RO.CFUT.RO  .Make  a  efuture  in  RO. 

MOVE  RO, 110,A1] 

MOVE  [9,A11,R0  .‘Force  the  efuture*. 

MOVE  [10,A1],RO 

BR  "Return 

Leaf:  MOVE  (newOleCobjTreeClaea, AJ],  RO 

XLATE  RO, objectXLATE, A2  .Get  the  header  word  tor  objeett  of  thia  claae  and  put  it  in  R3. 

HOVE  loClaisWord,A2],R3 

MOVE  ( newDistobjTreelD, A3 } . R1 

MOVE  { newDistobjTreeStart , A31 ,  RO 

CALL  CO 

MOVE  R3,R0 

CALL  allocNewObject 

HOVE  (newDistobjTreelD, A3), RO 

HOVE  RO, (oDiatob jCroup, A2 ]  :Initialize  the  qroup,  index,  and  logical-limit  instance  variables. 

HOVE  [newDistobjTreeStart, A3], RO 

HOVE  RO, (o0istobjlndex,A2] 

MOVE  [newuistobjTreeSlze, A3],R0 

MOVE  RO. [oDistobjLoqicalLimit, A2) 

Return:  MOVE  [newDi stobjTreeRet urnlD, A3 ] , R3 

BNIL  R3, "GoSuspend 

DC  MSG :fflsgReply 1 4 

SEND20  R3,R0 

SENDO  R3 

SENDO  (newOistobjTreeReturnSlot , A3 ] 

SENDEO  [newDistobjTreelD, A3) 

GoSuspend:  SUSPEND 

End2 : 

END 

REF  REV  fNewOistobjTree  »  ID:  ( 1<<30  (  (2«mxi  «sX  i  <2AfflY)  <<sr  ((2affl2)  <<s2  M2<trS/ cejS) 

DOWNLOAD  LookupHethod 
DOWNLOAD  f_New  Distobj 
DOWNLOAD  fNeuDlstob jTree 
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!  131,  132,  146,  168 

Ar^gument  14, 140 

Cons  15 

#’  132,  133 

Evaluation  Order  145 

Constant  135 

”  133 

t>assing  Convention  140 

Expression  135 

#:  132,  133 

Array  152, 164 

Predefined  133 

#\  133 

Boolean  164 

Constituent  158 

#L  168 

Integer  164 

Number  158 

%  131 

Ash  162 

Context  Future  144 

&  131,  132 

Become  129 

Continuation  141, 150, 151 

&cas-er  138 

Begin  17, 135, 144, 149, 170 

Copy  156 

&cwriter  138 

Block  131, 150 

Cput 148 

&guard  168 

BNF  130 

Cset  17, 18, 147, 148 

&iinTnutable  138 

Body  144 

Multiple  Value  148 

&inline  138, 139, 140,  141, 

Boolean  152 

Cwriter  Method  138 

147, 148 

Boolean-Array  152, 164 

Declare  169,  171 

&name  140 

Busy? 163 

Deep-copy  156 

&no-Ieak  140 

CAP  149 

Deep-dispose  156 

&non-strict  141 

Car  15 

Defclass  15,  131, 137 

&non-strict  144,  145 

CAS  147, 149 

Defconstant  131, 135 

&not-inline  138,  140, 141, 

CAS-er  Method  138 

Defglobal  131, 135, 170 

147,  148 

Cdr  15 

Define  131, 136 

&not-in1ine-default  139 

Cfuture  19, 144 

Defmacro  21,  168 

&predicate  138 

Semantics  144 

Defmethod  131,  135,  143,  150 

&reader  138 

Char-ready?  167 

Defparameter  135 

&side-effect-free  141 

Character  133, 152 

Defselector  131, 135,  142 

&value  140, 141 

Class  15, 133, 137, 152 

Defun  14, 131, 135, 142,  150 

&WTitcr  138 

Abstract  130 

Delete-Dead-Defs  172 

'  132 

Assertion  146 

Delete-Locals  173 

*  161 

Built-in  137, 152 

Delete-Moves  172 

+  161 

Hierarchy  153 

Delete-Touches  172 

- 161 

Immutable  138 

Describe  171 

/161 

Inheritance  137 

Detailed-Progress  171, 173 

//161 

Inline  139 

Display  167 

:  137,  140,  141,  147,  148 

Inquiry  156 

Display-print  167 

141 

Metaclass  152 

Display-stream  167 

;  15,  134 

Predicate  15, 138 

Dispose  156 

<  160 

Class-kind?  157 

Distarray  20 

<=  160 

Class-of  156 

Distobj  20, 152,  158 

<>  160 

Clet  17, 18, 131, 147 

Distributed  Object  20, 152, 

=  160 

Multiple  Value  148 

158 

>  160 

Co  158 

Creation  158 

>=  160 

Collection  152 

Distributed-Class  152 

?  131,  132,  149,  168 

Comments  15, 134 

End-of-file  133,  166 

@ 132, 168 

Common  Lisp  21,  129, 131, 

Eq  160 

[]  148 

140,  168 

Error  169 

\  133 

Compact-DCs  173 

Evaluation  Order  145 

131 

Compact-Sends  173 

Exit  150 

_ 131,  140, 147 

Compact-Vars  172 

Expression  144 

Abstract  Class  130 

Compile  171 

Constant  135 

Abstract  Method  130 

Compiler  Option  169 

False  133, 152 

Acquire  163 

Complex  Numbers  131 

Fast-Apply  173 

Add-method  143 
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